From mboxrd@z Thu Jan 1 00:00:00 1970
From: chris@lslsun.epfl.ch (Christian Iseli)
To: crux@Pool.Informatik.RWTH-Aachen.DE, meissner@cygnus.com
Cc: law@cygnus.com, rth@cygnus.com, pcg@goof.com, egcs@cygnus.com
Subject: Re: question regarding asm()
Date: Tue, 28 Oct 1997 08:31:00 -0000
Message-id: <199710281607.RAA24080@lslsun17.epfl.ch>
X-SW-Source: 1997-10/msg01186.html

> Bernd Schmidt writes:
> | > Yup.  I wouldn't be suprised if this ends up being similar to the problems
> | > with passing args in registers for SMALL_REGISTER_CLASS machines.
> |
> | I'd like to mention here that the reload patch I posted here a few months ago
> | does not only cause better code to be generated on the x86, it should also
> | solve these problems by taking hard register lifetimes into account and only
> | using free hard regs for reloads.
>
> I hoped to start looking at this.  I am concerned about breaking the
> Linux kernel due to changing semantics.  Have you considered changing
> it so it still has the same semantics regarding input/output reloads?

*RAMBLING MODE ON*

I tried the patch and it seems to solve many problems for me.  I'm
configuring GCC on a small embedded 8-bit microprocessor named CoolRISC
(<http://www.coolrisc.com/> in case you want more info... :-)  We will be
transferring the copyright of the whole thing to the FSF shortly, and
sending the source for inclusion in the GCC and/or EGCS distributions, but
in case anybody wants a sneak preview, I can probably arrange that...

Anyway, I've come across quite a few stumbling blocks while developing
this back-end, and one of the major pains I have is with reload...

The first thing to say is that registers are scarce.  There are 16 8-bit
registers, of which 1 is unusable (status register) and 1 is an accumulator
which gets clobbered by most operations.  Addresses use 16 bits, and thus
2 registers are needed to hold one address.  4 groups of 2 registers can be
used to reference memory.
One of those groups is used as the stack pointer, leaving 3 free groups,
except when a frame pointer is needed...  Needless to say, a program
compiled without -O (and thus needing a frame pointer) has little chance
of success.

So I have seen a lot of "Cannot find a register to spill" messages...
The usual scenario goes like this: we have a bunch of insns looking like so:

1 (set (reg A) something)
2 (set (reg B) (mem (plus (reg A) const)))
3 (set (reg C) something)
4 (set (reg D) (mem (plus (reg C) const)))
5 (set (mem (plus (reg B) const))
       (some_op (mem (plus (reg B) const))
                (mem (plus (reg D) const))))

Local-alloc does its thing and is able to allocate A and B to hard
register P1 and C and D to hard register P2.

Now reload starts and pretty soon discovers that it has to spill hard
register P1 to use it as a reload reg.  Things now look like this:

1 (set (reg A) something)
2 (set (reg B) (mem (plus (reg A) const)))
3 (set (reg P2) something)
4 (set (reg P2) (mem (plus (reg P2) const)))
5 (set (mem (plus (reg B) const))
       (some_op (mem (plus (reg B) const))
                (mem (plus (reg P2) const))))

Then, it comes back to insn 5 and calls find_reloads.  The conclusion of
find_reloads is that it will need four address registers (of class
ADDR_REGS), plus 2 optional reloads of the mem parameters.  Strangely
enough, all reloads end up being some form of input* reloads.  Since at
most 3 address registers can be found, reload dies miserably.

Of course, by carefully looking at the RTL, it is clear that (in this
example) 1 address register would be enough.  I'm not yet sure how to
teach reload to solve this problem.

In the meantime, I tried Bernd's patch and it seems to partly solve (or
avoid) the problem.  The resulting compiler produced slightly better code,
but died a few times with the "forbidden register was spilled" message.
Most of the time, GCC was trying to spill the (unneeded) frame pointer.
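
The register counting in the scenario above can be sketched in C.  This is
only a toy model under stated assumptions, not GCC code: the struct and the
function names are hypothetical, and it does not reproduce how find_reloads
arrives at four address registers.  It only counts the address register
pairs left over and the distinct base registers that insn 5 still has to
load once P1 has been spilled.

```c
#include <assert.h>
#include <string.h>

/* Toy model, not GCC code: a memory operand is reduced to the name of its
   base register and a flag saying whether that base already sits in a hard
   address register.  All names here are hypothetical. */
struct mem_operand {
    const char *base;
    int in_hard_reg;
};

/* Address register pairs available for reloads: 4 pairs in total, one is
   the stack pointer, and one more goes to the frame pointer if needed. */
int free_addr_pairs(int need_frame_pointer)
{
    return 4 - 1 - (need_frame_pointer ? 1 : 0);
}

/* Count the distinct bases that still have to be loaded into an address
   register; operands sharing a base are counted once. */
int distinct_bases_to_reload(const struct mem_operand *ops, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        int seen = 0;
        if (ops[i].in_hard_reg)
            continue;
        for (int j = 0; j < i; j++)
            if (!ops[j].in_hard_reg && strcmp(ops[j].base, ops[i].base) == 0)
                seen = 1;
        if (!seen)
            count++;
    }
    return count;
}

/* Insn 5 after P1 was spilled: two operands based on pseudo B (no hard
   register any more) and one based on hard register P2. */
void check_insn5_example(void)
{
    const struct mem_operand insn5[] = {
        { "B", 0 }, { "B", 0 }, { "P2", 1 }
    };
    assert(distinct_bases_to_reload(insn5, 3) == 1);
    assert(free_addr_pairs(0) == 3);
}
```

Under this model a single address register pair covers insn 5, while three
pairs are free (two when a frame pointer is needed), which is why a demand
for four looks excessive.
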
I applied a small patch to tell reload that it's OK to spill the FP when
it is not needed, and now I'm left with only a few cases where reload dies
while trying to spill a forbidden (pseudo) register.

BTW, I use the PlumHall validation suite to test the compiler...  When I
have a little time, I'll try to use the GCC test suite.

*RAMBLING MODE OFF*

Bernd, does your patch try to address my reload problem, or is it merely
a side effect?

BTW, your patch does not apply cleanly on egcs-971023: the last hunk of
local-alloc.c fails.  I think the hunk below is correct though...

Seems I used enough bandwidth already...  Anybody care to comment on this?

Cheers,
					Christian

*************** block_alloc (b)
*** 1710,1726 ****
          {
            for (i = qty_first_reg[q]; i >= 0; i = reg_next_in_qty[i])
              reg_renumber[i] = qty_phys_reg[q] + reg_offset[i];
-           if (qty_scratch_rtx[q])
-             {
-               if (GET_CODE (qty_scratch_rtx[q]) == REG)
-                 abort ();
-
-               qty_scratch_rtx[q] = gen_rtx (REG, GET_MODE (qty_scratch_rtx[q]),
-                                             qty_phys_reg[q]);
-
-               scratch_block[scratch_index] = b;
-               scratch_list[scratch_index++] = qty_scratch_rtx[q];
-             }
          }
      }
--- 1583,1588 ----