* [Bug c/102169] New: powerpc64 int memory operations using FP instructions
@ 2021-09-02  2:44 npiggin at gmail dot com
  2021-09-28 18:25 ` [Bug target/102169] " segher at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: npiggin at gmail dot com @ 2021-09-02  2:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169

            Bug ID: 102169
           Summary: powerpc64 int memory operations using FP instructions
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: npiggin at gmail dot com
                CC: bergner at gcc dot gnu.org
  Target Milestone: ---
            Target: powerpc64le-linux-gnu

--- test.c ---
int foo, bar;

void test(void)
{
        foo = bar;
}
---

Using Debian gcc 10.2 with the -O2 flag, this compiles to:

     addis   r2,r12,0
     addi    r2,r2,0
     addis   r9,r2,0
     addi    r9,r9,0
     lfiwzx  f0,0,r9
     addis   r9,r2,0
     addi    r9,r9,0
     stfiwx  f0,0,r9
     blr

Peter confirmed it also uses FP registers with trunk (but I don't have the asm
output at hand).

This can be suboptimal on some processors. For example, on POWER9 lfiwzx is
"Tuple Restricted (R)", which reduces dispatch throughput on the cycle in which
it is dispatched. More generally, it seems a surprising thing to do when there
is no shortage of GPRs.


* [Bug target/102169] powerpc64 int memory operations using FP instructions
  2021-09-02  2:44 [Bug c/102169] New: powerpc64 int memory operations using FP instructions npiggin at gmail dot com
@ 2021-09-28 18:25 ` segher at gcc dot gnu.org
  2021-09-28 19:54 ` bergner at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: segher at gcc dot gnu.org @ 2021-09-28 18:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-09-28
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Segher Boessenkool <segher at gcc dot gnu.org> ---
Confirmed.  The unwanted code happens with -mcpu=power8 and later.


* [Bug target/102169] powerpc64 int memory operations using FP instructions
  2021-09-02  2:44 [Bug c/102169] New: powerpc64 int memory operations using FP instructions npiggin at gmail dot com
  2021-09-28 18:25 ` [Bug target/102169] " segher at gcc dot gnu.org
@ 2021-09-28 19:54 ` bergner at gcc dot gnu.org
  2021-09-28 19:58 ` bergner at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bergner at gcc dot gnu.org @ 2021-09-28 19:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169

--- Comment #2 from Peter Bergner <bergner at gcc dot gnu.org> ---
So we have the following during IRA:

(insn 7 2 8 2 (set (reg:SI 120 [ barD.3297 ])
        (mem/c:SI (plus:DI (unspec:DI [
                        (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
                        (reg:DI 2 2)
                    ] UNSPEC_TOCREL)
                (const_int 4 [0x4])) [1 barD.3297+0 S4 A32])) "bug.i":5:13 544 {*movsi_internal1}
     (expr_list:REG_EQUIV (mem/c:SI (plus:DI (unspec:DI [
                        (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
                        (reg:DI 2 2)
                    ] UNSPEC_TOCREL)
                (const_int 4 [0x4])) [1 barD.3297+0 S4 A32])
        (nil)))
(insn 8 7 11 2 (set (mem/c:SI (unspec:DI [
                    (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [1 fooD.3296+0 S4 A32])
        (reg:SI 120 [ barD.3297 ])) "bug.i":5:13 544 {*movsi_internal1}
     (expr_list:REG_DEAD (reg:SI 120 [ barD.3297 ])
        (nil)))

...with:

Pass 1 for finding pseudo/allocno costs

    r120: preferred GEN_OR_VSX_REGS, alternative NO_REGS, allocno GEN_OR_VSX_REGS

  a0(r120,l0) costs: BASE_REGS:0,0 GENERAL_REGS:0,0 FLOAT_REGS:0,0
    ALTIVEC_REGS:0,0 VSX_REGS:4000,4000 GEN_OR_FLOAT_REGS:8000,8000
    GEN_OR_VSX_REGS:8000,8000 LINK_REGS:12000,12000 CTR_REGS:12000,12000
    LINK_OR_CTR_REGS:12000,12000 SPEC_OR_GEN_REGS:12000,12000
    ALL_REGS:36000,36000 MEM:8000,8000

[snip]
      Allocno a0r120 of GEN_OR_VSX_REGS(93) has 93 avail. regs  0 3-12 14-95,
node:  0 3-12 14-95 (confl regs =  1-2 13 96-110)

[snip]
      Pushing a0(r120,l0)(cost 0)
      Popping a0(r120,l0)  --         assign reg 32
Disposition:
    0:r120 l0    32

Looking at REG_ALLOC_ORDER, f32 is the first register we attempt to use and
since it has the same zero cost as the GPRs, we end up keeping it.
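
To illustrate the tie-breaking described above, here is a minimal toy model in
C. It is not IRA's actual algorithm: the function name pick_hard_reg, the short
order array in the usage note and the cost table are all made up for
illustration. It only shows that when every candidate has the same cost, the
first entry in the allocation order is the one that is kept.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (hypothetical, not GCC code): walk an allocation order and
   keep the first hard register whose cost is strictly lower than the
   best one seen so far.  cost[] is indexed by hard register number.  */
int pick_hard_reg(const int *order, const int *cost, size_t n)
{
    assert(n > 0);
    int best = order[0];
    for (size_t i = 1; i < n; i++) {
        /* A strict comparison means ties never displace an earlier entry.  */
        if (cost[order[i]] < cost[best])
            best = order[i];
    }
    return best;
}
```

With an order such as {32, 33, 3, 4, 9} and all costs zero, this returns 32,
matching the "assign reg 32" line in the dump. Giving registers 32 and 33 any
positive cost makes it return 3 instead.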


* [Bug target/102169] powerpc64 int memory operations using FP instructions
  2021-09-02  2:44 [Bug c/102169] New: powerpc64 int memory operations using FP instructions npiggin at gmail dot com
  2021-09-28 18:25 ` [Bug target/102169] " segher at gcc dot gnu.org
  2021-09-28 19:54 ` bergner at gcc dot gnu.org
@ 2021-09-28 19:58 ` bergner at gcc dot gnu.org
  2021-09-30  2:38 ` guihaoc at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bergner at gcc dot gnu.org @ 2021-09-28 19:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169

--- Comment #3 from Peter Bergner <bergner at gcc dot gnu.org> ---
It's interesting that the VSX reg cost is 4000, but the FPR and Altivec regs
have zero cost, like the GPRs.  I wonder why that is.  These look like d-form
memory accesses and we don't have d-form Altivec memory ops, so I'd expect
those to have a non-zero cost.

Sure, FPRs have d-form ops, but why do we attempt to assign FPRs before GPRs
again?


* [Bug target/102169] powerpc64 int memory operations using FP instructions
  2021-09-02  2:44 [Bug c/102169] New: powerpc64 int memory operations using FP instructions npiggin at gmail dot com
                   ` (2 preceding siblings ...)
  2021-09-28 19:58 ` bergner at gcc dot gnu.org
@ 2021-09-30  2:38 ` guihaoc at gcc dot gnu.org
  2021-10-08 22:12 ` carll at gcc dot gnu.org
  2022-01-05 18:18 ` segher at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: guihaoc at gcc dot gnu.org @ 2021-09-30  2:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169

HaoChen Gui <guihaoc at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guihaoc at gcc dot gnu.org

--- Comment #4 from HaoChen Gui <guihaoc at gcc dot gnu.org> ---
In this case, it picks "GEN_OR_VSX_REGS" because FLOAT_REGS costs zero in the
IRA pass. There is a "d,Z" alternative pair in the "*movsi_internal1" insn
pattern. When the second operand is not an "indexed_or_indirect_operand", a
reload is needed. Here the reload is needed because the address is d-form and
doesn't match 'Z'. So we should penalize the reload of 'Z' by changing the
alternative to '^Z'.
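
The effect of the proposed '^Z' can be sketched with a toy cost function in C.
This is an illustration only: the names and the penalty value are hypothetical,
and it assumes (as the zero FLOAT_REGS cost in the comment 2 dump suggests)
that the cost of the FPR alternative does not otherwise account for the address
reload. The one property taken from the GCC constraint documentation is that
'^' disparages an alternative only when the operand it is attached to needs a
reload.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical penalty unit, chosen only for illustration.  */
enum { CARET_PENALTY = 1 };

/* 'Z' accepts indexed or indirect addresses, so a d-form
   (register + offset) address does not match it.  */
bool matches_Z(bool addr_is_dform)
{
    return !addr_is_dform;
}

/* Toy cost of the FPR alternative as this model sees it: zero unless the
   memory constraint carries '^' and the operand needs a reload.  */
int fpr_alt_cost(bool has_caret, bool addr_is_dform)
{
    bool needs_reload = !matches_Z(addr_is_dform);
    return (has_caret && needs_reload) ? CARET_PENALTY : 0;
}
```

In this model fpr_alt_cost(false, true) is 0, so the FPR alternative ties with
the GPRs for a d-form address. With the caret, fpr_alt_cost(true, true) is
positive, while an indexed address still costs nothing.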


* [Bug target/102169] powerpc64 int memory operations using FP instructions
  2021-09-02  2:44 [Bug c/102169] New: powerpc64 int memory operations using FP instructions npiggin at gmail dot com
                   ` (3 preceding siblings ...)
  2021-09-30  2:38 ` guihaoc at gcc dot gnu.org
@ 2021-10-08 22:12 ` carll at gcc dot gnu.org
  2022-01-05 18:18 ` segher at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: carll at gcc dot gnu.org @ 2021-10-08 22:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169

Carl Love <carll at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carll at gcc dot gnu.org

--- Comment #5 from Carl Love <carll at gcc dot gnu.org> ---
With no optimization, GCC generates a three instruction sequence to store to
the stack frame rather than using a single stw instruction.

Adding this for informational purposes per Segher's request.

The test program

#include <stdio.h>
#include <stdlib.h>

int marker2 (int a) { return (1); }

int
main (int argc, char **argv, char **envp)
{
    marker2 (43);
    return argc;
}

This generates the three-instruction sequence in both the main and marker2
functions. I am pasting only the marker2 code, for P9 and P10.

On Power 9

carll@marlin:~$ gcc -g stfiwx-bug.c -o stfiwx-bug
carll@marlin:~$ objdump -S -d stfiwx-bug > stfiwx-bug.dump
carll@marlin:~$ which gcc
/usr/bin/gcc
carll@marlin:~$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

int marker2 (int a) { return (1); }
 7bc:   f8 ff e1 fb     std     r31,-8(r1)
 7c0:   c1 ff 21 f8     stdu    r1,-64(r1)
 7c4:   78 0b 3f 7c     mr      r31,r1
 7c8:   78 1b 69 7c     mr      r9,r3
 7cc:   2c 00 3f 91     stw     r9,44(r31)             << here
 7d0:   01 00 20 39     li      r9,1
 7d4:   78 4b 23 7d     mr      r3,r9
 7d8:   40 00 3f 38     addi    r1,r31,64
 7dc:   f8 ff e1 eb     ld      r31,-8(r1)
 7e0:   20 00 80 4e     blr


on Power 10

carll@ltcd97-lp1:~$ gcc -g stfiwx-bug.c -o stfiwx-bug
carll@ltcd97-lp1:~$ objdump -S -d stfiwx-bug > stfiwx-bug.dump
carll@ltcd97-lp1:~$ which gcc
/usr/bin/gcc
carll@ltcd97-lp1:~$ gcc --version
gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0

int marker2 (int a) { return (1); }
 7bc:   f8 ff e1 fb     std     r31,-8(r1)
 7c0:   c1 ff 21 f8     stdu    r1,-64(r1)
 7c4:   78 0b 3f 7c     mr      r31,r1
 7c8:   78 1b 69 7c     mr      r9,r3
 7cc:   e6 01 09 7c     mtfprwz f0,r9           << here
 7d0:   2c 00 3f 39     addi    r9,r31,44       << here
 7d4:   ae 4f 00 7c     stfiwx  f0,0,r9         << here
 7d8:   01 00 20 39     li      r9,1
 7dc:   78 4b 23 7d     mr      r3,r9
 7e0:   40 00 3f 38     addi    r1,r31,64
 7e4:   f8 ff e1 eb     ld      r31,-8(r1)
 7e8:   20 00 80 4e     blr
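
As a way to cross-check the two listings, the following C sketch decodes the
D-form fields from the instruction bytes that objdump prints (little-endian
byte order on this target). The struct and function names are hypothetical. It
shows that the stw in the first listing and the addi in the second carry the
same base register (r31) and displacement (44). stfiwx has only an indexed
form, so the address has to be computed separately before the store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a Power ISA D-form instruction (hypothetical helper type).  */
struct dform {
    unsigned opcode;  /* primary opcode, top 6 bits */
    unsigned rt;      /* RT or RS register field */
    unsigned ra;      /* base register field */
    int16_t d;        /* signed 16-bit displacement or immediate */
};

/* Assemble the 32-bit instruction word from objdump's little-endian bytes.  */
uint32_t word_from_le_bytes(const uint8_t b[4])
{
    assert(b != NULL);
    return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
           (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Split a D-form instruction word into its fields.  */
struct dform decode_dform(uint32_t insn)
{
    struct dform f;
    f.opcode = insn >> 26;
    f.rt = (insn >> 21) & 0x1f;
    f.ra = (insn >> 16) & 0x1f;
    f.d = (int16_t)(insn & 0xffff);
    return f;
}
```

Feeding it the bytes 2c 00 3f 91 gives primary opcode 36 (stw) with RS=9,
RA=31, D=44. The bytes 2c 00 3f 39 give primary opcode 14 (addi) with RT=9,
RA=31, SI=44.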


* [Bug target/102169] powerpc64 int memory operations using FP instructions
  2021-09-02  2:44 [Bug c/102169] New: powerpc64 int memory operations using FP instructions npiggin at gmail dot com
                   ` (4 preceding siblings ...)
  2021-10-08 22:12 ` carll at gcc dot gnu.org
@ 2022-01-05 18:18 ` segher at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: segher at gcc dot gnu.org @ 2022-01-05 18:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102169

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|NEW                         |RESOLVED

--- Comment #6 from Segher Boessenkool <segher at gcc dot gnu.org> ---
There is a proposed patch in PR103197 that does something similar to what
HaoChen suggests in comment 4.

*** This bug has been marked as a duplicate of bug 103197 ***


