From: Steven Bosscher <stevenb.gcc@gmail.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Cc: Vladimir Makarov <vmakarov@redhat.com>,
	David Edelsohn <dje.gcc@gmail.com>
Subject: LRA vs reload on powerpc: 2 extra FAILs that are actually improvements?
Date: Sat, 02 Nov 2013 22:49:00 -0000
Message-ID: <CABu31nPyoqz26Cwmzq2PhFL+oezacA0CYoLy8vTOUhj07=CgRg@mail.gmail.com>

Hello,

Today's powerpc64-linux gcc has 2 extra failures with -mlra compared to
reload (i.e. unpatched svn).

(I'm excluding guality failure differences here because there are too
many of them that seem to fail at random after minimal changes
anywhere in the compiler...).

Test results are posted here:
reload: http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00128.html
lra: http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg00129.html

The new failures and total scores are as follows (+=lra, -=reload):
+FAIL: gcc.target/powerpc/pr53199.c scan-assembler-times stwbrx 6
+FAIL: gcc.target/powerpc/pr58330.c scan-assembler-not stwbrx

                === gcc Summary ===

-# of expected passes           97887
-# of unexpected failures       536
+# of expected passes           97903
+# of unexpected failures       538
 # of unexpected successes      38
 # of expected failures         244
-# of unsupported tests         1910
+# of unsupported tests         1892


The failure of pr53199.c is caused by different instruction selection
for bswap. The test case reduces to just one function:

/* { dg-options "-O2 -mcpu=power6 -mavoid-indexed-addresses" } */
long long
reg_reverse (long long x)
{
  return __builtin_bswap64 (x);
}

Reload left vs. LRA right:
reg_reverse:                           reg_reverse:
        srdi 8,3,32                  |         addi 8,1,-16
        rlwinm 7,3,8,0xffffffff      |         srdi 10,3,32
        rlwinm 9,8,8,0xffffffff      |         addi 9,8,4
        rlwimi 7,3,24,0,7            |         stwbrx 3,0,8
        rlwimi 7,3,24,16,23          |         stwbrx 10,0,9
        rlwimi 9,8,24,0,7            |         ld 3,-16(1)
        rlwimi 9,8,24,16,23          <
        sldi 7,7,32                  <
        or 7,7,9                     <
        mr 3,7                       <
        blr                                    blr

This same difference is responsible for the failure of pr58330.c which
also uses __builtin_bswap64().
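
To make it easier to see that the two columns compute the same value, here
is a rough C model of both sequences. It is only a sketch: the helper names
(rotl32, bswap32_rotate_insert, stwbrx_model, ld_be_model and so on) are made
up for illustration, and the memory path writes the bytes out by hand in
big-endian order so that it behaves the same on any host.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t w, unsigned n)
{
  return (w << n) | (w >> (32 - n));
}

/* Register path (reload column): one rlwinm and two rlwimi per half. */
static uint32_t bswap32_rotate_insert(uint32_t w)
{
  uint32_t r = rotl32(w, 8);                   /* rlwinm rD,rS,8,0xffffffff */
  uint32_t t = rotl32(w, 24);                  /* rotation used by rlwimi   */
  r = (r & ~0xff000000u) | (t & 0xff000000u);  /* rlwimi rD,rS,24,0,7       */
  r = (r & ~0x0000ff00u) | (t & 0x0000ff00u);  /* rlwimi rD,rS,24,16,23     */
  return r;
}

static uint64_t bswap64_registers(uint64_t x)
{
  uint64_t lo = bswap32_rotate_insert((uint32_t) x);          /* r7 */
  uint64_t hi = bswap32_rotate_insert((uint32_t) (x >> 32));  /* r9 */
  return (lo << 32) | hi;                                     /* sldi, or */
}

/* Model of stwbrx: store the low 32 bits of rs, least significant byte
   first, at ea.  */
static void stwbrx_model(uint8_t *ea, uint64_t rs)
{
  uint32_t w = (uint32_t) rs;
  for (int i = 0; i < 4; i++)
    ea[i] = (uint8_t) (w >> (8 * i));
}

/* Model of a big-endian ld: the byte at ea is the most significant one. */
static uint64_t ld_be_model(const uint8_t *ea)
{
  uint64_t v = 0;
  for (int i = 0; i < 8; i++)
    v = (v << 8) | ea[i];
  return v;
}

/* Memory path (LRA column): two stwbrx into a stack slot, then one ld. */
static uint64_t bswap64_memory(uint64_t x)
{
  uint8_t slot[8];                 /* stands in for the slot at -16(1) */
  stwbrx_model(slot, x);           /* stwbrx 3,0,8                     */
  stwbrx_model(slot + 4, x >> 32); /* srdi 10,3,32; stwbrx 10,0,9      */
  return ld_be_model(slot);        /* ld 3,-16(1)                      */
}
```

Both bswap64_registers (0x0102030405060708) and bswap64_memory
(0x0102030405060708) give 0x0807060504030201, so the difference between the
columns is purely one of instruction selection, not of semantics.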

The difference in RTL for the test case is this (after reload vs. after LRA):
-   11: {%7:DI=bswap(%3:DI);clobber %8:DI;clobber %9:DI;clobber %10:DI;}
-   20: %3:DI=%7:DI
+   20: %8:DI=%1:DI-0x10
+   21: %8:DI=%8:DI  // stupid no-op move
+   11: {[%8:DI]=bswap(%3:DI);clobber %9:DI;clobber %10:DI;clobber scratch;}
+   19: %3:DI=[%1:DI-0x10]

So LRA believes going through memory is better than using a register,
even though obviously there are plenty of registers available.

What LRA does:
      Creating newreg=129
Removing SCRATCH in insn #11 (nop 2)
      Creating newreg=130
Removing SCRATCH in insn #11 (nop 3)
      Creating newreg=131
Removing SCRATCH in insn #11 (nop 4)
// at this point the insn would be a bswapdi2_64bit:
//   11: {%3:DI=bswap(%3:DI);clobber r129;clobber r130;clobber r131;}
// cost calculation for the insn alternatives:
            0 Early clobber: reject++
            1 Non-pseudo reload: reject+=2
            1 Spill pseudo in memory: reject+=3
            2 Scratch win: reject+=2
            3 Scratch win: reject+=2
            4 Scratch win: reject+=2
          alt=0,overall=18,losers=1,rld_nregs=0
            0 Non-pseudo reload: reject+=2
            0 Spill pseudo in memory: reject+=3
            0 Non input pseudo reload: reject++
            2 Scratch win: reject+=2
            3 Scratch win: reject+=2
          alt=1,overall=16,losers=1,rld_nregs=0
            Staticly defined alt reject+=12
            0 Early clobber: reject++
            2 Scratch win: reject+=2
            3 Scratch win: reject+=2
            4 Scratch win: reject+=2
            0 Conflict early clobber reload: reject--
          alt=2,overall=24,losers=1,rld_nregs=0
         Choosing alt 1 in insn 11:  (0) Z  (1) r  (2) &b  (3) &r  (4) X {*bswapdi2_64bit}
      Change to class BASE_REGS for r129
      Change to class GENERAL_REGS for r130
      Creating newreg=132 from oldreg=3, assigning class NO_REGS to r132
      Change to class NO_REGS for r131
   11: {r132:DI=bswap(%3:DI);clobber r129:DI;clobber r130:DI;clobber r131:DI;}
      REG_UNUSED r131:DI
      REG_UNUSED r130:DI
      REG_UNUSED r129:DI

LRA selects alternative 1 (Z,r,&b,&r,X) which seems to be the right
choice, from looking at the constraints. Reload selects alternative 2
which is slightly^2 discouraged: (??&r,r,&r,&r,&r).
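
The "overall" numbers in the dump above can be reproduced by hand. The
sketch below assumes that overall = 6 * losers + (sum of the reject
adjustments). The factor 6 is inferred from the numbers in the dump, not
taken from anything stated in this message, and the names (LOSER_COST_FACTOR,
alt_overall, cheapest_alt) are hypothetical.

```c
#include <stddef.h>

/* Assumed weight of one "loser" operand; chosen because it reproduces
   the overall= values printed in the dump.  */
enum { LOSER_COST_FACTOR = 6, NUM_ALTS = 3, MAX_REJECTS = 6 };

/* Reject adjustments copied from the dump, one row per alternative.
   Unused trailing entries are zero.  */
static const int rejects[NUM_ALTS][MAX_REJECTS] = {
  { 1, 2, 3, 2, 2, 2 },    /* alt 0 */
  { 2, 3, 1, 2, 2, 0 },    /* alt 1 */
  { 12, 1, 2, 2, 2, -1 },  /* alt 2: static reject 12, then one reject-- */
};

/* Every alternative in the dump has losers=1.  */
static const int losers[NUM_ALTS] = { 1, 1, 1 };

/* Overall cost of one alternative under the assumed formula.  */
static int alt_overall(int alt)
{
  int sum = 0;
  for (size_t i = 0; i < MAX_REJECTS; i++)
    sum += rejects[alt][i];
  return losers[alt] * LOSER_COST_FACTOR + sum;
}

/* Index of the alternative with the lowest overall cost.  */
static int cheapest_alt(void)
{
  int best = 0;
  for (int alt = 1; alt < NUM_ALTS; alt++)
    if (alt_overall(alt) < alt_overall(best))
      best = alt;
  return best;
}
```

Under that assumption the three alternatives come out at 18, 16 and 24,
matching the dump, and cheapest_alt () returns 1. Alternative 1 wins by only
2 points over alternative 0, so a small change in any single reject
adjustment would be enough to flip the choice.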

Is this an improvement or a regression? If it's an improvement then
these two test cases should be adjusted :-)

Ciao!
Steven
