[Bug target/42498] New: GCC can't use smull to compute int * int --> long long

public inbox for gcc-bugs@sourceware.org

* [Bug target/42498]  New: GCC can't use smull to compute int * int --> long long
@ 2009-12-25  9:12 carrot at google dot com
  2009-12-28 17:59 ` [Bug target/42498] " markusl dot se78 at gmail dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: carrot at google dot com @ 2009-12-25  9:12 UTC (permalink / raw)
  To: gcc-bugs

Compile the following code with -O2:

extern "C" void foo(long long a, long long b, long long c);

extern "C" void bar(int a, int b, int c, int d)
{
      long long x = (long long)a*b;
      long long y = (long long)b*c;
      long long z = (long long)c*d;
      foo(x,y,z);
}


gcc generates:

bar:
        stmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
        .save {r4, r5, r6, r7, r8, r9, sl, fp, lr}
.LCFI0:
        mov     fp, r1
        mov     ip, fp, asr #31
        mov     lr, r2
        mov     r4, r0
        mov     r5, r4, asr #31
        mov     sl, ip
        mov     r6, r3
        mul     r9, r1, r5
        mov     r3, r2, asr #31
        mul     sl, lr, sl
        mul     r8, r6, r3
        mov     r7, ip
        mov     fp, r6
        mov     ip, fp, asr #31
        umull   r4, r5, r1, r0
        mla     r7, r0, r7, r9
        mla     r0, r1, r3, sl
        umull   r2, r3, lr, r1
        mov     r1, ip
        mla     r1, lr, r1, r8
        umull   fp, ip, r6, lr
        add     r5, r7, r5
        add     r1, r1, ip
        mov     ip, r1
        .pad #12
        sub     sp, sp, #12
.LCFI1:
        add     r3, r0, r3
        mov     r1, r5
        mov     r0, r4
        stmia   sp, {fp-ip}
        bl      foo
        add     sp, sp, #12
        ldmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
        bx      lr

Ideally the generated code would use the ARM instruction smull for each
(int * int --> long long) operation, but gcc first sign-extends the operands and
then performs a full double-word multiplication.
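
As a rough illustration (not part of the original report), the sketch below
models the two computations in C++. product_expanded follows the shape of the
generated code above: each operand's high word is its sign word, the low words
go through a 32x32->64 unsigned multiply (the umull), and the two cross terms
are folded into the high word (the mul/mla/add instructions). product_widening
is the single signed 32x32->64 multiply that smull provides. All function names
are made up for this sketch, and it assumes a 32-bit int and two's-complement
conversions.

```cpp
#include <cstdint>

// 32x32->64 unsigned multiply: the job of a single ARM umull.
constexpr std::uint64_t umull(std::uint32_t x, std::uint32_t y) {
    return static_cast<std::uint64_t>(x) * y;
}

// High word of a sign-extended 32-bit value (what `mov hi, lo, asr #31` yields).
constexpr std::uint32_t sign_word(std::int32_t v) {
    return v < 0 ? 0xFFFFFFFFu : 0u;
}

// Full 64x64->64 multiply built from 32-bit halves. Only the low 64 bits are
// kept, so the hi*hi term drops out and the cross terms wrap modulo 2^32.
constexpr std::uint64_t mul64_from_halves(std::uint32_t alo, std::uint32_t ahi,
                                          std::uint32_t blo, std::uint32_t bhi) {
    return umull(alo, blo) +
           (static_cast<std::uint64_t>(
                static_cast<std::uint32_t>(alo * bhi + ahi * blo)) << 32);
}

// Sign-extend both operands, then do the full double-word multiply.
// The final cast relies on modular (two's-complement) conversion.
constexpr std::int64_t product_expanded(std::int32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(mul64_from_halves(
        static_cast<std::uint32_t>(a), sign_word(a),
        static_cast<std::uint32_t>(b), sign_word(b)));
}

// One signed widening multiply, the operation smull performs.
constexpr std::int64_t product_widening(std::int32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(a) * b;
}

static_assert(product_expanded(-3, 5) == product_widening(-3, 5),
              "both routes agree for sign-extended operands");
```

For sign-extended operands the two functions return the same value, which is
why one smull per product would be enough.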

The following analysis is from Seongbae Park:


============== From Seongbae Park ==========================

Looks like the RTL expander/combiner tried its best to screw up this case, even
though there's an smull pattern for exactly this case,
and it's all downhill from there.


*.final_cleanup:

void bar(int, int, int, int) (a, b, c, d)
{
 long long int D.1732;
 long long int D.1731;

<bb 2>:
 D.1731 = (long long int) b;
 D.1732 = (long long int) c;
 foo (D.1731 * (long long int) a, D.1732 * D.1731, (long long int) d * D.1732) [tail call];
 return;

}

When this input reaches the RTL expander, it causes the expander to emit a full
64x64->64-bit multiply instead of a 32x32->64 multiply. Once that full sequence
is expanded, no pass cleans up the unnecessary multiplies and adds,
so the RTL isn't in a shape the combiner can turn into an smull insn.

One possible solution is to make the expander look at more of the tree/gimple
than it currently does.
E.g., if it could see that D.1731 and D.1732 were originally int values,
it could recognize that those 64-bit multiplies are actually 32x32->64 multiplies
and use smull at expansion time.

I wish there were an easy way to slap BURG into gcc, so we could fix this kind of
problem once and for all :(


In this case, it's not so much the registers, but how the IR looks
like when it reaches RTL expander.
In particular:

D.1731 = (long long int) b;
D.1732 = (long long int) c;
foo (D.1731 * (long long int) a, D.1732 * D.1731, (long long int) d * D.1732) [tail call];

Those extra names, D.1731 and D.1732, hide from the expander the fact that
the values are originally 32-bit,
so the expander isn't aware that this is a 32x32->64 multiply.

One way to show this is a slightly modified example:

extern "C" void foo(long long a, long long b, long long c);

extern "C" void bar(int a, int b, int c, int d, int e, int f)
{
 long long x = a*b;
 long long y = c*d;
 long long z = e*f;
  foo(x, y, z);
}


At the end of the middle-end, the above source becomes:

void bar(int, int, int, int, int, int) (a, b, c, d, e, f)
{
<bb 2>:
 foo ((long long int) (b * a), (long long int) (d * c), (long long int) (f * e)) [tail call];
 return;

}

And this exposes everything the expander needs to do its job properly,
and leads to the following assembly:

      stmfd   sp!, {r4, r5}
      .save {r4, r5}
.LCFI0:
      mul     r1, r0, r1
      ldr     r4, [sp, #8]
      ldr     r0, [sp, #12]
      mul     r3, r2, r3
      mul     r4, r4, r0
      mov     r2, r3
      mov     r3, r2, asr #31
      mov     r5, r4, asr #31
      mov     r0, r1
      mov     r1, r0, asr #31
      strd    r4, [sp, #8]
      ldmfd   sp!, {r4, r5}
      b       foo
.LFE2:

Nice and clean, even though there are more variables involved.


Sorry, the example was missing the cast.

A proper example:

extern "C" void foo(long long a, long long b, long long c);

extern "C" void bar(int a, int b, int c, int d, int e, int f)
{
 long long x = (long long)a*b;
 long long y = (long long)c*d;
 long long z = (long long)e*f;
  foo(x, y, z);
}

and this leads to:

void bar(int, int, int, int, int, int) (a, b, c, d, e, f)
{
<bb 2>:
 foo ((long long int) b * (long long int) a, (long long int) d * (long long int) c, (long long int) f * (long long int) e) [tail call];
 return;

}


and finally:

      stmfd   sp!, {r4, r5}
      .save {r4, r5}
.LCFI0:
      smull   r0, r1, r1, r0
      ldr     r4, [sp, #12]
      ldr     ip, [sp, #8]
      smull   r2, r3, r3, r2
      smull   r4, r5, r4, ip
      strd    r4, [sp, #8]
      ldmfd   sp!, {r4, r5}
      b       foo
.LFE2:

=============== END ===============================
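
The difference between the two variants in the quoted analysis comes down to
where the widening happens. As a small sketch (not from the original thread,
with made-up function names), narrow_then_extend models
`long long x = a*b;`, where the multiply is done in 32 bits and only the
result is sign-extended (the mul plus asr #31 sequence), while
widen_then_multiply models `long long x = (long long)a*b;`, the case smull
covers. The sketch uses unsigned arithmetic so the 32-bit wraparound is well
defined; in the original source an overflowing int multiply is undefined
behavior. It assumes a 32-bit int and two's-complement conversions.

```cpp
#include <cstdint>

// Models `long long x = a * b;`: multiply in 32 bits, then sign-extend the
// truncated result. Unsigned operands make the wraparound well defined here.
constexpr std::int64_t narrow_then_extend(std::int32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(static_cast<std::int32_t>(
        static_cast<std::uint32_t>(a) * static_cast<std::uint32_t>(b)));
}

// Models `long long x = (long long)a * b;`: widen first, so the full
// 64-bit product is kept.
constexpr std::int64_t widen_then_multiply(std::int32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(a) * b;
}

// Small operands give the same answer either way.
static_assert(narrow_then_extend(3, -4) == widen_then_multiply(3, -4),
              "no difference when the product fits in 32 bits");
```

With a = b = 65536 the true product is 2^32, which widen_then_multiply keeps,
while narrow_then_extend sees only the low 32 bits of it.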


-- 
           Summary: GCC can't use smull to compute int * int --> long long
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: carrot at google dot com
 GCC build triplet: i686-linux
  GCC host triplet: i686-linux
GCC target triplet: arm-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42498


From: markusl dot se78 at gmail dot com @ 2009-12-28 17:59 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from markusl dot se78 at gmail dot com  2009-12-28 17:59 -------
Another solution might be to have a define_insn_and_split for the 64x64->64-bit
multiply. That lets the combiner do its job and combine it into a widening
multiplication; if that fails, the insn is later split into the double-word
multiply sequence. Anyway, this is what I had in mind for fixing the same
issue in a private port (based on gcc 4.4.1). Are there any obvious drawbacks
to such an approach (besides that it is local and needs to be applied to every
target that sees the problem)?


-- 

markusl dot se78 at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |markusl dot se78 at gmail
                   |                            |dot com


From: ramana at gcc dot gnu dot org @ 2009-12-29 23:21 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from ramana at gcc dot gnu dot org  2009-12-29 23:21 -------
This looks like a DUP of PR29274.

*** This bug has been marked as a duplicate of 29274 ***


-- 

ramana at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE


From: rguenth at gcc dot gnu dot org @ 2009-12-31 15:31 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenth at gcc dot gnu dot org  2009-12-31 15:30 -------
This is because widening multiplication is not detected when
sub-expressions are used more than once.


From: bernds at codesourcery dot com @ 2010-04-22 10:01 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from bernds at codesourcery dot com  2010-04-22 10:01 -------
Fixed now.

