public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/5] Tweak IRA handling of tying and earlyclobbers
@ 2019-06-21 13:38 Richard Sandiford
From: Richard Sandiford @ 2019-06-21 13:38 UTC
  To: gcc-patches

This series of patches tweaks the IRA handling of matched constraints
and earlyclobbers.  The main explanations are in the individual patches.

Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.

I also tried building at least one target per CPU directory and
comparing the effect of the patches on the assembly output for
gcc.c-torture, gcc.dg and g++.dg using -O2 -ftree-vectorize.  The table
below summarises the effect on the number of lines of assembly, ignoring
tests for which the number of lines was the same:

Target                 Tests  Delta   Best  Worst Median
======                 =====  =====   ====  ===== ======
alpha-linux-gnu           87   -126    -96    138     -1
arm-linux-gnueabi         38    -37    -10      4     -1
arm-linux-gnueabihf       38    -37    -10      4     -1
avr-elf                   19    -64    -60     14     -1
bfin-elf                 143    -55    -21     21     -1
c6x-elf                   38    -32     -9     16     -1
cris-elf                 253  -1456   -192     24     -1
csky-elf                 101   -221    -36     26     -1
frv-linux-gnu             11    -23     -8     -1     -1
ft32-elf                   1     -2     -2     -2     -2
hppa64-hp-hpux11.23       66    -24    -12     12     -1
i686-apple-darwin         22    -45    -24     11     -1
i686-pc-linux-gnu         18    -65    -96     40     -1
ia64-linux-gnu             1     -4     -4     -4     -4
m68k-linux-gnu            83     31    -70     18      1
mcore-elf                 26   -122    -38     11     -2
mmix                      29   -110    -25      3     -1
mn10300-elf              399    258    -70     70      1
msp430-elf               120   1363    -13    833      2
pdp11                     37    -90    -92     25     -1
powerpc-ibm-aix7.0        31    -25     -4      3     -1
powerpc64-linux-gnu       31    -26     -2      2     -1
powerpc64le-linux-gnu     31    -26     -2      2     -1
pru-elf                    2      8      1      7      1
riscv32-elf                1     -2     -2     -2     -2
riscv64-elf                1     -2     -2     -2     -2
rl78-elf                   6    -20    -18      9     -3
rx-elf                   123     32    -58     30     -1
s390-linux-gnu             7     16     -6      9      1
s390x-linux-gnu            1     -3     -3     -3     -3
sh-linux-gnu             475  -4696   -843     42     -1
spu-elf                  168   -296   -114     25     -2
visium-elf               214   -936   -183     22     -1
x86_64-darwin             30    -25     -4      2     -1
x86_64-linux-gnu          28    -29     -4      1     -1
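
For readers who want to reproduce this kind of table, here is a minimal
sketch of how one row could be derived from per-test line counts.  The
function name, the dictionary layout and the sample numbers are all
hypothetical; the only things taken from the table are the column
meanings.  The use of a lower median is an inference from the pru-elf
row (two tests, +1 and +7, median 1), not something stated above.

```python
import statistics


def summarise(before, after):
    """Summarise line-count changes for one target.

    `before` and `after` map a test name to its number of assembly
    lines without and with the patches (hypothetical layout).
    """
    # Ignore tests whose line count did not change.
    diffs = [after[t] - before[t]
             for t in before
             if t in after and after[t] != before[t]]
    if not diffs:
        return None
    return {
        "tests": len(diffs),
        "delta": sum(diffs),
        "best": min(diffs),    # most negative change (largest saving)
        "worst": max(diffs),   # most positive change (largest growth)
        # Assumption: lower median, inferred from the pru-elf row.
        "median": statistics.median_low(diffs),
    }


# Made-up line counts chosen to reproduce the pru-elf row.
before = {"a.c": 100, "b.c": 50, "c.c": 30}
after = {"a.c": 101, "b.c": 57, "c.c": 30}
row = summarise(before, after)
print(row)
```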

Of course, the number of lines is only a very rough guide to code size
and code size is only a very rough guide to performance.  It's just
a way of getting a feel for how invasive the change is in practice.

As is often the case with this kind of comparison, quite a few changes
in either direction come from things that the RA doesn't consider, such
as the ability to merge code after RA.

The msp430-elf results are especially misleading.  The port has patterns
like:

;; Alternatives 2 and 3 are to handle cases generated by reload.
(define_insn "subqi3"
  [(set (match_operand:QI           0 "nonimmediate_operand" "=rYs,  rm,  &?r, ?&r")
	(minus:QI (match_operand:QI 1 "general_operand"       "0,    0,    !r,  !i")
		  (match_operand:QI 2 "general_operand"      " riYs, rmi, rmi,   r")))]
  ""
  "@
  SUB.B\t%2, %0
  SUB%X0.B\t%2, %0
  MOV%X0.B\t%1, %0 { SUB%X0.B\t%2, %0
  MOV%X0.B\t%1, %0 { SUB%X0.B\t%2, %0"
)

The patches make more use of the first two (cheap) alternatives
in preference to the third, but sometimes at the cost of introducing
moves elsewhere.  Each alternative counts as one line in this test,
but the third alternative is really two instructions.
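
To make the counting problem concrete, here is a small sketch that
counts both lines and instructions.  It assumes that "{" separates two
instructions on one line, as the templates for alternatives 2 and 3
suggest.  The register names and the two assembly snippets are made up
for illustration and are not taken from real compiler output.

```python
def count_lines(asm):
    """Number of non-blank lines, which is what the table measures."""
    return sum(1 for line in asm.splitlines() if line.strip())


def count_instructions(asm):
    """Number of instructions, assuming "{" separates two on one line."""
    return sum(len([part for part in line.split("{") if part.strip()])
               for line in asm.splitlines())


# Hypothetical output using the untied third alternative: one line,
# but really a move followed by a subtract.
untied = "MOV.B\tR12, R13 { SUB.B\tR14, R13\n"

# Hypothetical output using the tied first alternative, at the cost of
# a separate move elsewhere: two lines, two instructions.
tied = "MOV.B\tR12, R13\nSUB.B\tR14, R13\n"

print(count_lines(untied), count_instructions(untied))
print(count_lines(tied), count_instructions(tied))
```

In this made-up case the line count grows from one to two even though
the instruction count is unchanged.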

(If the port does actually want us to prefer the third alternative
over introducing moves, then I think the constraints need to be
changed.  Using "!" heavily disparages the alternative and so
it's reasonable for the optimisers to try hard to avoid it.
If the alternative is actually the preferred way of handling
untied operands then the "?" on operand 0 should be enough.)
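
As a reading aid, the following sketch splits the subqi3 constraint
strings quoted above into their four alternatives and flags the
modifiers discussed here: "&" (earlyclobber), "?" (slight disparagement),
"!" (severe disparagement) and a digit (matching constraint, i.e. a tie).
The helper names are hypothetical and the parsing is deliberately
simplified; it is not how GCC itself processes constraints.

```python
def split_alternatives(constraint):
    """Split one operand's constraint string into its alternatives."""
    return [alt.strip() for alt in constraint.split(",")]


def describe(alt):
    """Flag the modifiers that matter for this discussion (simplified)."""
    return {
        "earlyclobber": "&" in alt,
        "slightly_disparaged": "?" in alt,
        "severely_disparaged": "!" in alt,
        # A digit is a matching constraint: the operand is tied to
        # the operand with that number.
        "tied_to": next((int(c) for c in alt if c.isdigit()), None),
    }


# Constraint strings copied from the subqi3 pattern above.
operands = {
    0: "=rYs,  rm,  &?r, ?&r",
    1: "0,    0,    !r,  !i",
    2: " riYs, rmi, rmi,   r",
}

table = {op: [describe(alt) for alt in split_alternatives(text)]
         for op, text in operands.items()}

for alt in range(4):
    flags = {op: table[op][alt] for op in operands}
    print("alternative", alt, flags)
```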

The arm-* improvements come from patterns like:

(define_insn_and_split "*negdi2_insn"
  [(set (match_operand:DI         0 "s_register_operand" "=r,&r")
	(neg:DI (match_operand:DI 1 "s_register_operand"  "0,r")))
   (clobber (reg:CC CC_REGNUM))]
  "TARGET_32BIT"

The patches make IRA assign a saving of one full move to ties between
operands 0 and 1, whereas previously it would only assign a saving
of an eighth of a move.
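
As a rough illustration of the size of that change, the toy calculation
below scales the saving by how often the instruction executes.  The
frequency weighting, the function name and the frequency value are
assumptions made for the example; this is not IRA's actual cost code.

```python
from fractions import Fraction


def tie_saving(insn_freq, moves_saved):
    """Toy model: saving from a tie, weighted by execution frequency."""
    return insn_freq * moves_saved


freq = 1000  # hypothetical execution frequency of the negdi2 insn

old = tie_saving(freq, Fraction(1, 8))  # an eighth of a move
new = tie_saving(freq, Fraction(1))     # one full move

print(old, new, new / old)
```

Under these assumptions the tie carries eight times as much weight as
before when it competes with other allocation preferences.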

The other big winners (e.g. cris-*, sh-* and visium-*) have similar cases.

I'll post the SVE patches that rely on and test for this later.

Thanks,
Richard


Thread overview: 16+ messages
2019-06-21 13:38 [PATCH 0/5] Tweak IRA handling of tying and earlyclobbers Richard Sandiford
2019-06-21 13:40 ` [PATCH 1/5] Use alternative_mask for add_insn_allocno_copies Richard Sandiford
2019-06-28 11:46   ` Richard Sandiford
2019-06-28 14:17     ` Vladimir Makarov
2019-06-21 13:41 ` [PATCH 2/5] Simplify ira_setup_alts Richard Sandiford
2019-06-24 14:30   ` Vladimir Makarov
2019-06-21 13:42 ` [PATCH 4/5] Allow earlyclobbers in ira_get_dup_out_num Richard Sandiford
2019-06-24 14:32   ` Vladimir Makarov
2019-06-21 13:42 ` [PATCH 3/5] Make ira_get_dup_out_num handle more cases Richard Sandiford
2019-06-24 14:32   ` Vladimir Makarov
2019-06-21 13:43 ` [PATCH 5/5] Use ira_setup_alts for conflict detection Richard Sandiford
2019-06-24 14:33   ` Vladimir Makarov
2019-07-01  9:01     ` Richard Sandiford
2019-06-21 17:43 ` [PATCH 0/5] Tweak IRA handling of tying and earlyclobbers Richard Sandiford
2019-06-24  7:54   ` Eric Botcazou
2019-06-24  8:06     ` Richard Sandiford
