public inbox for gcc@gcc.gnu.org
* LRA for avr: help with FP and elimination
@ 2023-07-13  9:27 SenthilKumar.Selvaraj
From: SenthilKumar.Selvaraj @ 2023-07-13  9:27 UTC (permalink / raw)
  To: gcc

Hi,

  I've been spending some (spare) time checking what it would take to
  make LRA work for the avr target.

  Right after I removed the TARGET_LRA_P hook disabling LRA, building
  libgcc failed with a weird ICE. On the avr, the stack pointer (SP)
  is not used to access stack slots - TARGET_CAN_ELIMINATE returns false
  if frame_pointer_needed, and TARGET_FRAME_POINTER_REQUIRED returns true
  if get_frame_size() > 0.
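
  To make the interplay of the two hooks concrete, here is a minimal toy
  model in C. It is only a sketch of the behaviour described above, not
  the avr backend source: every toy_* name is hypothetical, the hook
  signatures are simplified, and allowing all pairs other than FP -> SP
  is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, not GCC's real globals or hook signatures.  */
enum { TOY_FP_REGNUM = 28, TOY_SP_REGNUM = 32 };

static bool toy_frame_pointer_needed; /* models frame_pointer_needed */
static long toy_frame_size;           /* models get_frame_size () */

/* Models TARGET_FRAME_POINTER_REQUIRED as described: FP is required
   as soon as the frame is non-empty.  */
static bool toy_frame_pointer_required (void)
{
  return toy_frame_size > 0;
}

/* Models TARGET_CAN_ELIMINATE as described: FP -> SP elimination is
   refused once a frame pointer is needed.  */
static bool toy_can_eliminate (int from, int to)
{
  if (from == TOY_FP_REGNUM && to == TOY_SP_REGNUM)
    return !toy_frame_pointer_needed;
  return true; /* assumption: other pairs are not restricted here */
}

/* Shows that the answer depends on when the flag is computed.  */
static void toy_demo (void)
{
  /* Empty frame: no FP needed, so elimination to SP is allowed.  */
  toy_frame_size = 0;
  toy_frame_pointer_needed = toy_frame_pointer_required ();
  assert (toy_can_eliminate (TOY_FP_REGNUM, TOY_SP_REGNUM));

  /* The frame grows later, but the flag is not refreshed: the hook
     still allows elimination because it sees the stale flag.  */
  toy_frame_size = 2;
  assert (toy_can_eliminate (TOY_FP_REGNUM, TOY_SP_REGNUM));

  /* Once the flag is recomputed, elimination is refused.  */
  toy_frame_pointer_needed = toy_frame_pointer_required ();
  assert (!toy_can_eliminate (TOY_FP_REGNUM, TOY_SP_REGNUM));
}
```

  Calling toy_demo () runs through all three states. The middle one,
  where the frame is non-empty but the flag is stale, is the state in
  which SP can end up being used as a base register.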

  With LRA, however, reload generates

(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
        (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
     (nil))

  and the backend code errors out when it finds SP is being used as a
  pointer register.

  Digging through the RTL dumps, I found the following. For the
  following insn sequence in *.ira

(insn 189 128 159 7 (set (reg:HI 58 [ b ])
        (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 159 189 160 7 (set (subreg:QI (reg:HI 58 [ b ]) 0)
        (reg:QI 86 [ a ])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 160 159 32 7 (set (subreg:QI (reg:HI 58 [ b ]) 1)
        (reg:QI 87 [ a+1 ])) "case.c":7:7 86 {movqi_insn_split}
     (nil))

  1. For r58, IRA picks R28:R29, which is the frame pointer for avr.

      Popping a13(r58,l0)  --         assign reg 28

  2. LRA sees the subreg in insn 159 and generates a reload reg
  (r125).  simplify_subreg_regno (lra-constraints.cc:1810) however
  bails (returns -1) if the reg involved is FRAME_POINTER_REGNUM and
  reload isn't completed yet. LRA therefore decides rclass for the
  pseudo reg is NO_REGS.

<snip>
Creating newreg=125 from oldreg=58, assigning class NO_REGS to subreg reg r125
  159: r125:HI#0=r86:QI

  4. As rclass is NO_REGS, LRA picks an insn alternative that involves memory.
  That is my understanding, please correct me if I'm wrong.
<snip>
            0 Small class reload: reject+=3
            0 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=0,overall=610,losers=1,rld_nregs=1
            0 Small class reload: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non pseudo reload: reject++
          alt=2,overall=1,losers=0,rld_nregs=0
	 Choosing alt 2 in insn 159:  (0) Qm  (1) rY00 {movqi_insn_split}

  5. LRA creates stack slots, and then uses the FP register to access
  the slots. This is despite r58 already being assigned R28:R29. 

  6. TARGET_FRAME_POINTER_REQUIRED is never called, and therefore
     frame_pointer_needed is not set, despite the creation of stack
     slots. TARGET_CAN_ELIMINATE therefore okays elimination of FP to SP,
     and this eventually causes the ICE when the avr backend sees SP being
     used as a pointer register.

  This is the relevant sequence after reload
<snip>
(insn 189 128 239 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 239 189 159 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
        (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
        (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 240 159 241 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 241 240 160 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
        (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 160 241 242 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 2 [0x2])) [2 %sfp+2 S1 A8])
        (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 242 160 33 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
     (nil))

  For choices other than FP, simplify_subreg_regno returns the correct part
  of the wider HImode reg, so rclass is not NO_REGS, and things work out fine.
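
  The difference between FP and any other register pair can be sketched
  with a small toy model in C. This is not the code in
  simplify_subreg_regno or lra-constraints.cc; the toy_* names are
  hypothetical, and the mapping "byte N of a HImode value lives in hard
  reg regno + N" is an assumption based on the avr having 8-bit
  registers (it matches the r28/r29 pair in the classic reload dump
  below).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, not GCC's real enums.  */
enum toy_class { TOY_NO_REGS, TOY_GENERAL_REGS };
enum { TOY_FP_REGNUM = 28 };

/* Returns the hard reg holding the QImode byte at BYTE_OFFSET of a
   HImode value that starts at REGNO, or -1 when the value lives in
   the frame pointer and reload has not completed yet.  */
static int toy_subreg_regno (int regno, int byte_offset,
                             bool reload_completed)
{
  if (regno == TOY_FP_REGNUM && !reload_completed)
    return -1;
  return regno + byte_offset;
}

/* Models the class decision described in step 2: a failed subreg
   simplification leaves the reload pseudo with NO_REGS, which then
   steers alternative selection towards memory.  */
static enum toy_class toy_reload_class (int regno, int byte_offset,
                                        bool reload_completed)
{
  return toy_subreg_regno (regno, byte_offset, reload_completed) < 0
         ? TOY_NO_REGS : TOY_GENERAL_REGS;
}
```

  In this model toy_reload_class (28, 0, false) yields TOY_NO_REGS,
  while the same call with any other starting register, say 24, yields
  TOY_GENERAL_REGS.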

  I checked what classic reload does in the same situation - it picks a
  different register (R25) instead of spilling to a stack slot.

<snip>
(insn 189 128 159 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 159 189 226 7 (set (reg:QI 25 r25)
        (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 226 159 160 7 (set (reg:QI 28 r28)
        (reg:QI 25 r25)) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 160 226 227 7 (set (reg:QI 25 r25)
        (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 227 160 33 7 (set (reg:QI 29 r29)
        (reg:QI 25 r25)) "case.c":7:7 86 {movqi_insn_split}
     (nil))


  My questions:

  1. Is there something obvious the avr backend is doing wrong that is
  causing this?

  2. Shouldn't LRA ask the backend for frame_pointer_required_p and
  update frame_pointer_needed if it creates stack slots?

  3. Even if (2) works, I see that lra-eliminates.cc:update_reg_eliminate
  asserts that if the backend said elimination to SP is ok first up, it
  cannot reject that elimination later (line 1165). If the only reason
  FP is required is because LRA created stack slots, what should the
  backend do?
  
  4. When simplify_subreg_regno bails for FP, lra-constraints.cc:1815
  sets rclass = NO_REGS and forces a spill to memory. The comment says
  it is to prevent infinite looping, but for this case, doesn't it
  make sense to look for other regs?
  
  5. I can work around the problem by disabling elimination from FP to SP
  when lra_in_progress, but I think it prevents IRA/LRA from using
  R28:R29 even when FP is not required at all?

  6. Basic question, but does FP to SP elimination mean any operation
  possible with FP should be doable with SP as well?
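
  For reference, the workaround from question 5 amounts to something
  like the following toy model in C. It is a sketch only: the struct,
  the toy_* names and the simplified signature are hypothetical, and it
  does not reproduce the actual avr hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, not GCC's real globals.  */
enum { TOY_FP_REGNUM = 28, TOY_SP_REGNUM = 32 };

struct toy_state
{
  bool frame_pointer_needed; /* models frame_pointer_needed */
  bool lra_in_progress;      /* models lra_in_progress */
};

/* Refuses FP -> SP elimination while LRA is running, in addition to
   the existing frame_pointer_needed check.  */
static bool toy_can_eliminate_workaround (const struct toy_state *s,
                                          int from, int to)
{
  if (from == TOY_FP_REGNUM && to == TOY_SP_REGNUM)
    return !s->frame_pointer_needed && !s->lra_in_progress;
  return true; /* assumption: other pairs are not restricted here */
}
```

  With lra_in_progress set, the model refuses the elimination even when
  frame_pointer_needed is false, which is the case the question worries
  about.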


Regards
Senthil

  

