public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
       [not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
@ 2010-10-05 18:14 ` paul at pwsan dot com
  2010-10-08 13:05 ` rearnsha at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: paul at pwsan dot com @ 2010-10-05 18:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893

paul walmsley <paul at pwsan dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |paul at pwsan dot com

--- Comment #2 from paul walmsley <paul at pwsan dot com> 2010-10-05 18:14:35 UTC ---
Here's a minimal test case:

void foo(unsigned int v)
{
  *(volatile unsigned short *)0xabcdefab = (v);
}

arm-linux-gcc -O2 -march=armv7-a -c test.c; arm-linux-objdump -DS test.o | less


00000000 <foo>:
   0:   e30e3fff        movw    r3, #61439      ; 0xefff
   4:   e34a3bcd        movt    r3, #43981      ; 0xabcd
   8:   e6ff0070        uxth    r0, r0
   c:   e14305b4        strh    r0, [r3, #-84]
  10:   e12fff1e        bx      lr


As David notes, the expected behavior is that the uxth should not be generated
for armv6 and later targets, and the two shifts (lsl #16 / lsr #16) should not
be generated for pre-armv6 targets: strh stores only the low 16 bits of the
register, so the explicit zero-extension is superfluous.
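
A rough sketch of why the extension is redundant, in portable C rather than
ARM assembly. The helper names (store_with_uxth, store_without_uxth,
stores_agree, check_stores) are made up for illustration and do not appear in
the bug report; the mask stands in for uxth and the narrowing conversion
stands in for strh.

```c
#include <assert.h>
#include <stdint.h>

/* Models uxth followed by strh: zero-extend, then store the halfword. */
static uint16_t store_with_uxth(uint32_t v)
{
    uint32_t narrowed = v & 0xffffu;   /* what uxth computes */
    uint16_t slot = (uint16_t)narrowed; /* strh writes bits 0..15 */
    return slot;
}

/* Models strh alone: the store itself discards bits 16..31. */
static uint16_t store_without_uxth(uint32_t v)
{
    uint16_t slot = (uint16_t)v;
    return slot;
}

/* Returns 1 when both sequences leave the same halfword behind. */
static int stores_agree(uint32_t v)
{
    return store_with_uxth(v) == store_without_uxth(v);
}

/* Spot-check a few values, including ones with high bits set. */
static void check_stores(void)
{
    assert(stores_agree(0u));
    assert(stores_agree(0x0000ffffu));
    assert(stores_agree(0x12345678u));
    assert(stores_agree(0xffffffffu));
}
```

Calling check_stores() from a test harness should pass silently, since the
halfword that reaches memory depends only on bits 0..15 of the source value.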

http://marc.info/?l=linux-omap&m=128630215909798&w=2



* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
       [not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
  2010-10-05 18:14 ` [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily paul at pwsan dot com
@ 2010-10-08 13:05 ` rearnsha at gcc dot gnu.org
  2010-10-08 17:17 ` paul at pwsan dot com
  2021-12-13  8:55 ` [Bug rtl-optimization/40893] " pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 6+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2010-10-08 13:05 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893

Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rearnsha at gcc dot gnu.org

--- Comment #3 from Richard Earnshaw <rearnsha at gcc dot gnu.org> 2010-10-08 13:04:49 UTC ---
(In reply to comment #2)
> Here's a minimal test case:
> 
> void foo(unsigned int v)
> {
>   *(volatile unsigned short *)0xabcdefab = (v);
> }
> 
>

The compiler has to be extremely conservative with this code because it
contains a volatile memory reference.  It must take care not to modify that
access, and one consequence is that it is then difficult to remove the
narrowing operation.



* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
       [not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
  2010-10-05 18:14 ` [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily paul at pwsan dot com
  2010-10-08 13:05 ` rearnsha at gcc dot gnu.org
@ 2010-10-08 17:17 ` paul at pwsan dot com
  2021-12-13  8:55 ` [Bug rtl-optimization/40893] " pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 6+ messages in thread
From: paul at pwsan dot com @ 2010-10-08 17:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893

--- Comment #4 from paul walmsley <paul at pwsan dot com> 2010-10-08 17:17:24 UTC ---
The bug also appears without volatile:

----

/* generates an unnecessary uxth */
void foo(unsigned short a, unsigned short b, unsigned short c,
     unsigned short *e, unsigned short *f)
{
    *e = (a + b) + c;
    *f = (a + b) - c;
}

/* works as expected */
void bar(unsigned short a, unsigned short b, unsigned short c,
     unsigned short *e)
{
    *e = (a + b) + c;
}

/* works as expected */
void baz(unsigned short a, unsigned short b, unsigned short c,
     unsigned short *e)
{
    *e = (a + b) - c;
}

-----

compiled and dumped with:

arm-linux-gnueabi-gcc -O2 -c test.c ; objdump -DS test.o

produces:

-----

Disassembly of section .text:

00000000 <foo>:
   0:   e0811000        add     r1, r1, r0
   4:   e1a01801        lsl     r1, r1, #16
   8:   e1a01821        lsr     r1, r1, #16
   c:   e0620001        rsb     r0, r2, r1
  10:   e0821001        add     r1, r2, r1
  14:   e1c310b0        strh    r1, [r3]
  18:   e59d3000        ldr     r3, [sp]
  1c:   e1c300b0        strh    r0, [r3]
  20:   e12fff1e        bx      lr

00000024 <bar>:
  24:   e0811000        add     r1, r1, r0
  28:   e0822001        add     r2, r2, r1
  2c:   e1c320b0        strh    r2, [r3]
  30:   e12fff1e        bx      lr

00000034 <baz>:
  34:   e0811000        add     r1, r1, r0
  38:   e0622001        rsb     r2, r2, r1
  3c:   e1c320b0        strh    r2, [r3]
  40:   e12fff1e        bx      lr

-----

gcc -v:

Using built-in specs.
Target: arm-linux-gnueabi
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-2'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.4 --enable-shared --enable-multiarch
--enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/arm-linux-gnueabi/include/c++/4.4.5
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-objc-gc --disable-sjlj-exceptions --enable-checking=release
--program-prefix=arm-linux-gnueabi- --includedir=/usr/arm-linux-gnueabi/include
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=arm-linux-gnueabi
--with-headers=/usr/arm-linux-gnueabi/include
--with-libs=/usr/arm-linux-gnueabi/lib
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-2)



* [Bug rtl-optimization/40893] ARM and PPC truncate intermediate operations unnecessarily
       [not found] <bug-40893-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2010-10-08 17:17 ` paul at pwsan dot com
@ 2021-12-13  8:55 ` pinskia at gcc dot gnu.org
  3 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-13  8:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization
          Component|middle-end                  |rtl-optimization


* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
  2009-07-28 16:28 [Bug middle-end/40893] New: " lessen42+gcc at gmail dot com
  2009-07-28 22:28 ` [Bug middle-end/40893] " lessen42+gcc at gmail dot com
@ 2009-09-09 16:34 ` ramana at gcc dot gnu dot org
  1 sibling, 0 replies; 6+ messages in thread
From: ramana at gcc dot gnu dot org @ 2009-09-09 16:34 UTC (permalink / raw)
  To: gcc-bugs



-- 

ramana at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-09-09 16:33:48
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893



* [Bug middle-end/40893] ARM and PPC truncate intermediate operations unnecessarily
  2009-07-28 16:28 [Bug middle-end/40893] New: " lessen42+gcc at gmail dot com
@ 2009-07-28 22:28 ` lessen42+gcc at gmail dot com
  2009-09-09 16:34 ` ramana at gcc dot gnu dot org
  1 sibling, 0 replies; 6+ messages in thread
From: lessen42+gcc at gmail dot com @ 2009-07-28 22:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from lessen42+gcc at gmail dot com  2009-07-28 22:27 -------
More specifically, on x86_64 the following is generated with gcc-4.4 -O3
-march=core2 -S:
_dct2x2dc_dconly:
        movswl  2(%rdi),%edx
        pushq   %rbp
        addw    (%rdi), %dx
        movswl  6(%rdi),%eax
        movq    %rsp, %rbp
        addw    4(%rdi), %ax
        leal    (%rax,%rdx), %ecx
        subw    %ax, %dx
        movw    %cx, (%rdi)
        movw    %dx, 2(%rdi)
        leave
        ret

So it seems that the optimizer realizes that registers wider than 16 bits are
not needed, which allows memory operands on x86 and is optimal for this case.
However, the code generated for other architectures follows the 16-bit types
too literally, wasting instructions on truncating intermediate results to
16 bits.
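
The reason the intermediate truncation is wasted work is that addition and
subtraction are modular: the low 16 bits of the result depend only on the low
16 bits of the operands. A minimal sketch, assuming unsigned 16-bit inputs and
using hypothetical helper names that are not taken from the bug report:

```c
#include <assert.h>
#include <stdint.h>

/* Keeps the full-width intermediate and truncates only at the end. */
static uint16_t sum_truncate_late(uint16_t a, uint16_t b, uint16_t c)
{
    uint32_t t = (uint32_t)a + b;
    return (uint16_t)(t + c);
}

/* Truncates the intermediate first (models the extra uxth or lsl/lsr). */
static uint16_t sum_truncate_early(uint16_t a, uint16_t b, uint16_t c)
{
    uint32_t t = ((uint32_t)a + b) & 0xffffu;
    return (uint16_t)(t + c);
}

/* Same pair for the subtraction; unsigned wraparound is well defined. */
static uint16_t diff_truncate_late(uint16_t a, uint16_t b, uint16_t c)
{
    uint32_t t = (uint32_t)a + b;
    return (uint16_t)(t - c);
}

static uint16_t diff_truncate_early(uint16_t a, uint16_t b, uint16_t c)
{
    uint32_t t = ((uint32_t)a + b) & 0xffffu;
    return (uint16_t)(t - c);
}

/* Returns 1 when early and late truncation give the same stored values. */
static int truncation_is_redundant(uint16_t a, uint16_t b, uint16_t c)
{
    return sum_truncate_late(a, b, c) == sum_truncate_early(a, b, c)
        && diff_truncate_late(a, b, c) == diff_truncate_early(a, b, c);
}
```

Since both variants agree for every input, a final truncating store (strh on
ARM, movw on x86) is enough and the intermediate narrowing can be dropped.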


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893


