public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* LRA for avr: Handling hard regs set directly at expand
@ 2023-07-17 11:33 SenthilKumar.Selvaraj
  2023-07-27 13:11 ` Georg-Johann Lay
  2023-08-02 16:54 ` Vladimir Makarov
  0 siblings, 2 replies; 6+ messages in thread
From: SenthilKumar.Selvaraj @ 2023-07-17 11:33 UTC (permalink / raw)
  To: gcc; +Cc: vmakarov

Hi,

  The avr target has a bunch of patterns that directly set hard regs at expand time, like so

(define_expand "cpymemhi"
  [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
                   (match_operand:BLK 1 "memory_operand" ""))
              (use (match_operand:HI 2 "const_int_operand" ""))
              (use (match_operand:HI 3 "const_int_operand" ""))])]
  ""
  {
    if (avr_emit_cpymemhi (operands))
      DONE;

    FAIL;
  })

where avr_emit_cpymemhi generates

(insn 14 13 15 4 (set (reg:HI 30 r30)
        (reg:HI 48 [ ivtmp.10 ])) "pr53505.c":21:22 -1
     (nil))
(insn 15 14 16 4 (set (reg:HI 26 r26)
        (reg/f:HI 38 virtual-stack-vars)) "pr53505.c":21:22 -1
     (nil))
(insn 16 15 17 4 (parallel [
            (set (mem:BLK (reg:HI 26 r26) [0  A8])
                (mem:BLK (reg:HI 30 r30) [0  A8]))
            (unspec [
                    (const_int 0 [0])
                ] UNSPEC_CPYMEM)
            (use (reg:QI 52))
            (clobber (reg:HI 26 r26))
            (clobber (reg:HI 30 r30))
            (clobber (reg:QI 0 r0))
            (clobber (reg:QI 52))
        ]) "pr53505.c":21:22 -1
     (nil))

Classic reload knows about these - find_reg masks out bad_spill_regs, and bad_spill_regs
when ORed with chain->live_throughout in order_regs_for_reload picks up r30.

LRA, however, appears to not consider that, and proceeds to use such regs as reload regs.
For the same source, it generates
<snip>
 Choosing alt 0 in insn 15:  (0) =r  (1) r {*movhi_split}
      Creating newreg=70, assigning class GENERAL_REGS to r70
   15: r26:HI=r70:HI
      REG_EQUAL r28:HI+0x1
    Inserting insn reload before:
   58: r70:HI=r28:HI+0x1

Choosing alt 3 in insn 58:  (0) d  (1) 0  (2) nYnn {*addhi3_split}
      Creating newreg=71 from oldreg=70, assigning class LD_REGS to r71
   58: r71:HI=r71:HI+0x1
    Inserting insn reload before:
   59: r71:HI=r28:HI
    Inserting insn reload after:
   60: r70:HI=r71:HI

********** Assignment #1: **********

	 Assigning to 71 (cl=LD_REGS, orig=70, freq=3000, tfirst=71, tfreq=3000)...
	   Assign 30 to reload r71 (freq=3000)
	Hard reg 26 is preferable by r70 with profit 1000
	Hard reg 30 is preferable by r70 with profit 1000
	 Assigning to 70 (cl=GENERAL_REGS, orig=70, freq=2000, tfirst=70, tfreq=2000)...
	   Assign 30 to reload r70 (freq=2000)


(insn 14 13 59 3 (set (reg:HI 30 r30)
        (reg:HI 18 r18 [orig:48 ivtmp.10 ] [48])) "pr53505.c":21:22 101 {*movhi_split}
     (nil))
(insn 59 14 58 3 (set (reg:HI 30 r30 [70])
        (reg/f:HI 28 r28)) "pr53505.c":21:22 101 {*movhi_split}
     (nil))
(insn 58 59 15 3 (set (reg:HI 30 r30 [70])
        (plus:HI (reg:HI 30 r30 [70])
            (const_int 1 [0x1]))) "pr53505.c":21:22 165 {*addhi3_split}
     (nil))
(insn 15 58 16 3 (set (reg:HI 26 r26)
        (reg:HI 30 r30 [70])) "pr53505.c":21:22 101 {*movhi_split}
     (expr_list:REG_EQUAL (plus:HI (reg/f:HI 28 r28)
            (const_int 1 [0x1]))
        (nil)))
(insn 16 15 17 3 (parallel [
            (set (mem:BLK (reg:HI 26 r26) [0  A8])
                (mem:BLK (reg:HI 30 r30) [0  A8]))
            (unspec [
                    (const_int 0 [0])
                ] UNSPEC_CPYMEM)
            (use (reg:QI 22 r22 [52]))
            (clobber (reg:HI 26 r26))
            (clobber (reg:HI 30 r30))
            (clobber (reg:QI 0 r0))
            (clobber (reg:QI 22 r22 [52]))
        ]) "pr53505.c":21:22 132 {cpymem_qi}
     (nil))

LRA generates insn 59 that clobbers r30 set in insn 14, causing an execution
failure down the line.

How should the avr backend deal with this?

Regards
Senthil



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: LRA for avr: Handling hard regs set directly at expand
  2023-07-17 11:33 LRA for avr: Handling hard regs set directly at expand SenthilKumar.Selvaraj
@ 2023-07-27 13:11 ` Georg-Johann Lay
  2023-07-28  5:04   ` SenthilKumar.Selvaraj
  2023-08-02 16:54 ` Vladimir Makarov
  1 sibling, 1 reply; 6+ messages in thread
From: Georg-Johann Lay @ 2023-07-27 13:11 UTC (permalink / raw)
  To: SenthilKumar.Selvaraj, gcc; +Cc: vmakarov



Am 17.07.23 um 13:33 schrieb SenthilKumar.Selvaraj--- via Gcc:
> Hi,
> 
>    The avr target has a bunch of patterns that directly set hard regs at expand time, like so

The correct approach would be to use usual predicates together with 
constraints that describe the register instead of hard regs, e.g. 
(match_operand:HI n "register_operand" "R18_2") for a 2-byte register 
that starts at R18 instead of (reg:HI 18).  I deprecated and removed 
constraints starting with "R" long ago in order to get "R" free for that 
purpose.

Some years ago I tried such constraints (and hence also zoo of new 
register classes that are required to accommodate them).  The resulting 
code quality was so bad that I quickly abandoned that approach, and IIRC 
there were also spill fails.  Appears that reload / ira was overwhelmed 
by the multitude of new reg classes and took sub-optimal decisions.

The way out was more of explicit hard regs in expand, together with 
awkward functionalities like avr_fix_operands (PR63633) and the 
functions that use it.  That way we get correct code without performance 
penalties in unrelated places.

Most of such insns are explicitly modelling hand-written asm functions 
in libgcc, because most of these functions have a footprint smaller than 
the default ABI.  And some functions have an interface not complying to 
default ABI.

For the case of cpymem etc from below, explicit hard registers were used 
because register allocator did a bad job when using constraints like "e" 
(X, Y, or Z).

Johann


> (define_expand "cpymemhi"
>    [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
>                     (match_operand:BLK 1 "memory_operand" ""))
>                (use (match_operand:HI 2 "const_int_operand" ""))
>                (use (match_operand:HI 3 "const_int_operand" ""))])]
>    ""
>    {
>      if (avr_emit_cpymemhi (operands))
>        DONE;
> 
>      FAIL;
>    })
> 
> where avr_emit_cpymemhi generates
> 
> (insn 14 13 15 4 (set (reg:HI 30 r30)
>          (reg:HI 48 [ ivtmp.10 ])) "pr53505.c":21:22 -1
>       (nil))
> (insn 15 14 16 4 (set (reg:HI 26 r26)
>          (reg/f:HI 38 virtual-stack-vars)) "pr53505.c":21:22 -1
>       (nil))
> (insn 16 15 17 4 (parallel [
>              (set (mem:BLK (reg:HI 26 r26) [0  A8])
>                  (mem:BLK (reg:HI 30 r30) [0  A8]))
>              (unspec [
>                      (const_int 0 [0])
>                  ] UNSPEC_CPYMEM)
>              (use (reg:QI 52))
>              (clobber (reg:HI 26 r26))
>              (clobber (reg:HI 30 r30))
>              (clobber (reg:QI 0 r0))
>              (clobber (reg:QI 52))
>          ]) "pr53505.c":21:22 -1
>       (nil))
> 
> Classic reload knows about these - find_reg masks out bad_spill_regs, and bad_spill_regs
> when ORed with chain->live_throughout in order_regs_for_reload picks up r30.
> 
> LRA, however, appears to not consider that, and proceeds to use such regs as reload regs.
> For the same source, it generates
> <snip>
>   Choosing alt 0 in insn 15:  (0) =r  (1) r {*movhi_split}
>        Creating newreg=70, assigning class GENERAL_REGS to r70
>     15: r26:HI=r70:HI
>        REG_EQUAL r28:HI+0x1
>      Inserting insn reload before:
>     58: r70:HI=r28:HI+0x1
> 
> Choosing alt 3 in insn 58:  (0) d  (1) 0  (2) nYnn {*addhi3_split}
>        Creating newreg=71 from oldreg=70, assigning class LD_REGS to r71
>     58: r71:HI=r71:HI+0x1
>      Inserting insn reload before:
>     59: r71:HI=r28:HI
>      Inserting insn reload after:
>     60: r70:HI=r71:HI
> 
> ********** Assignment #1: **********
> 
> 	 Assigning to 71 (cl=LD_REGS, orig=70, freq=3000, tfirst=71, tfreq=3000)...
> 	   Assign 30 to reload r71 (freq=3000)
> 	Hard reg 26 is preferable by r70 with profit 1000
> 	Hard reg 30 is preferable by r70 with profit 1000
> 	 Assigning to 70 (cl=GENERAL_REGS, orig=70, freq=2000, tfirst=70, tfreq=2000)...
> 	   Assign 30 to reload r70 (freq=2000)
> 
> 
> (insn 14 13 59 3 (set (reg:HI 30 r30)
>          (reg:HI 18 r18 [orig:48 ivtmp.10 ] [48])) "pr53505.c":21:22 101 {*movhi_split}
>       (nil))
> (insn 59 14 58 3 (set (reg:HI 30 r30 [70])
>          (reg/f:HI 28 r28)) "pr53505.c":21:22 101 {*movhi_split}
>       (nil))
> (insn 58 59 15 3 (set (reg:HI 30 r30 [70])
>          (plus:HI (reg:HI 30 r30 [70])
>              (const_int 1 [0x1]))) "pr53505.c":21:22 165 {*addhi3_split}
>       (nil))
> (insn 15 58 16 3 (set (reg:HI 26 r26)
>          (reg:HI 30 r30 [70])) "pr53505.c":21:22 101 {*movhi_split}
>       (expr_list:REG_EQUAL (plus:HI (reg/f:HI 28 r28)
>              (const_int 1 [0x1]))
>          (nil)))
> (insn 16 15 17 3 (parallel [
>              (set (mem:BLK (reg:HI 26 r26) [0  A8])
>                  (mem:BLK (reg:HI 30 r30) [0  A8]))
>              (unspec [
>                      (const_int 0 [0])
>                  ] UNSPEC_CPYMEM)
>              (use (reg:QI 22 r22 [52]))
>              (clobber (reg:HI 26 r26))
>              (clobber (reg:HI 30 r30))
>              (clobber (reg:QI 0 r0))
>              (clobber (reg:QI 22 r22 [52]))
>          ]) "pr53505.c":21:22 132 {cpymem_qi}
>       (nil))
> 
> LRA generates insn 59 that clobbers r30 set in insn 14, causing an execution
> failure down the line.
> 
> How should the avr backend deal with this?
> 
> Regards
> Senthil
> 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: LRA for avr: Handling hard regs set directly at expand
  2023-07-27 13:11 ` Georg-Johann Lay
@ 2023-07-28  5:04   ` SenthilKumar.Selvaraj
  2023-07-28  8:11     ` Georg-Johann Lay
  0 siblings, 1 reply; 6+ messages in thread
From: SenthilKumar.Selvaraj @ 2023-07-28  5:04 UTC (permalink / raw)
  To: gcc, avr; +Cc: vmakarov

On Thu, 2023-07-27 at 15:11 +0200, Georg-Johann Lay wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> Am 17.07.23 um 13:33 schrieb SenthilKumar.Selvaraj--- via Gcc:
> > Hi,
> > 
> >    The avr target has a bunch of patterns that directly set hard regs at expand time, like so
> 
> The correct approach would be to use usual predicates together with
> constraints that describe the register instead of hard regs, e.g.
> (match_operand:HI n "register_operand" "R18_2") for a 2-byte register
> that starts at R18 instead of (reg:HI 18).  I deprecated and removed
> constraints starting with "R" long ago in order to get "R" free for that
> purpose.
> 
> Some years ago I tried such constraints (and hence also zoo of new
> register classes that are required to accommodate them).  The resulting
> code quality was so bad that I quickly abandoned that approach, and IIRC
> there were also spill fails.  Appears that reload / ira was overwhelmed
> by the multitude of new reg classes and took sub-optimal decisions.
> 
> The way out was more of explicit hard regs in expand, together with
> awkward functionalities like avr_fix_operands (PR63633) and the
> functions that use it.  That way we get correct code without performance
> penalties in unrelated places.
> 
> Most of such insns are explicitly modelling hand-written asm functions
> in libgcc, because most of these functions have a footprint smaller than
> the default ABI.  And some functions have an interface not complying to
> default ABI.
> 
> For the case of cpymem etc from below, explicit hard registers were used
> because register allocator did a bad job when using constraints like "e"
> (X, Y, or Z).

I guessed that much. Yes, using constraints works - I used "x" and "z" that
directly correspond to REG_X and REG_Z (ignore the weird operand numbering).

diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index be0f8dcbe0e..6c6c4e4e212 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -1148,20 +1148,20 @@
 ;; "cpymem_qi"
 ;; "cpymem_hi"
 (define_insn_and_split "cpymem_<mode>"
-  [(set (mem:BLK (reg:HI REG_X))
-        (mem:BLK (reg:HI REG_Z)))
+  [(set (mem:BLK (match_operand:HI 3 "register_operand" "+x"))
+        (mem:BLK (match_operand:HI 4 "register_operand" "+z")))
    (unspec [(match_operand:QI 0 "const_int_operand" "n")]
            UNSPEC_CPYMEM)
    (use (match_operand:QIHI 1 "register_operand" "<CPYMEM_r_d>"))
-   (clobber (reg:HI REG_X))
-   (clobber (reg:HI REG_Z))
+   (clobber (match_dup 3))
+   (clobber (match_dup 4))
    (clobber (reg:QI LPM_REGNO))
    (clobber (match_operand:QIHI 2 "register_operand" "=1"))]
   ""
   "#"
   "&& reload_completed"
-  [(parallel [(set (mem:BLK (reg:HI REG_X))
-                   (mem:BLK (reg:HI REG_Z)))
+  [(parallel [(set (mem:BLK (match_dup 3))
+                   (mem:BLK (match_dup 4)))
               (unspec [(match_dup 0)]
                       UNSPEC_CPYMEM)
               (use (match_dup 1))

I know you did these changes a long time ago, but do you happen to have any
test cases lying around that I can use to see if LRA does a better job than
classic reload?

Vladimir, given that classic reload handled such hardcoded hard regs just 
fine, should LRA also be able to deal with them the same way? Or is this
something that LRA is not going to support?

Regards
Senthil

> 
> Johann
> 
> 
> > (define_expand "cpymemhi"
> >    [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
> >                     (match_operand:BLK 1 "memory_operand" ""))
> >                (use (match_operand:HI 2 "const_int_operand" ""))
> >                (use (match_operand:HI 3 "const_int_operand" ""))])]
> >    ""
> >    {
> >      if (avr_emit_cpymemhi (operands))
> >        DONE;
> > 
> >      FAIL;
> >    })
> > 
> > where avr_emit_cpymemhi generates
> > 
> > (insn 14 13 15 4 (set (reg:HI 30 r30)
> >          (reg:HI 48 [ ivtmp.10 ])) "pr53505.c":21:22 -1
> >       (nil))
> > (insn 15 14 16 4 (set (reg:HI 26 r26)
> >          (reg/f:HI 38 virtual-stack-vars)) "pr53505.c":21:22 -1
> >       (nil))
> > (insn 16 15 17 4 (parallel [
> >              (set (mem:BLK (reg:HI 26 r26) [0  A8])
> >                  (mem:BLK (reg:HI 30 r30) [0  A8]))
> >              (unspec [
> >                      (const_int 0 [0])
> >                  ] UNSPEC_CPYMEM)
> >              (use (reg:QI 52))
> >              (clobber (reg:HI 26 r26))
> >              (clobber (reg:HI 30 r30))
> >              (clobber (reg:QI 0 r0))
> >              (clobber (reg:QI 52))
> >          ]) "pr53505.c":21:22 -1
> >       (nil))
> > 
> > Classic reload knows about these - find_reg masks out bad_spill_regs, and bad_spill_regs
> > when ORed with chain->live_throughout in order_regs_for_reload picks up r30.
> > 
> > LRA, however, appears to not consider that, and proceeds to use such regs as reload regs.
> > For the same source, it generates
> > <snip>
> >   Choosing alt 0 in insn 15:  (0) =r  (1) r {*movhi_split}
> >        Creating newreg=70, assigning class GENERAL_REGS to r70
> >     15: r26:HI=r70:HI
> >        REG_EQUAL r28:HI+0x1
> >      Inserting insn reload before:
> >     58: r70:HI=r28:HI+0x1
> > 
> > Choosing alt 3 in insn 58:  (0) d  (1) 0  (2) nYnn {*addhi3_split}
> >        Creating newreg=71 from oldreg=70, assigning class LD_REGS to r71
> >     58: r71:HI=r71:HI+0x1
> >      Inserting insn reload before:
> >     59: r71:HI=r28:HI
> >      Inserting insn reload after:
> >     60: r70:HI=r71:HI
> > 
> > ********** Assignment #1: **********
> > 
> >        Assigning to 71 (cl=LD_REGS, orig=70, freq=3000, tfirst=71, tfreq=3000)...
> >          Assign 30 to reload r71 (freq=3000)
> >       Hard reg 26 is preferable by r70 with profit 1000
> >       Hard reg 30 is preferable by r70 with profit 1000
> >        Assigning to 70 (cl=GENERAL_REGS, orig=70, freq=2000, tfirst=70, tfreq=2000)...
> >          Assign 30 to reload r70 (freq=2000)
> > 
> > 
> > (insn 14 13 59 3 (set (reg:HI 30 r30)
> >          (reg:HI 18 r18 [orig:48 ivtmp.10 ] [48])) "pr53505.c":21:22 101 {*movhi_split}
> >       (nil))
> > (insn 59 14 58 3 (set (reg:HI 30 r30 [70])
> >          (reg/f:HI 28 r28)) "pr53505.c":21:22 101 {*movhi_split}
> >       (nil))
> > (insn 58 59 15 3 (set (reg:HI 30 r30 [70])
> >          (plus:HI (reg:HI 30 r30 [70])
> >              (const_int 1 [0x1]))) "pr53505.c":21:22 165 {*addhi3_split}
> >       (nil))
> > (insn 15 58 16 3 (set (reg:HI 26 r26)
> >          (reg:HI 30 r30 [70])) "pr53505.c":21:22 101 {*movhi_split}
> >       (expr_list:REG_EQUAL (plus:HI (reg/f:HI 28 r28)
> >              (const_int 1 [0x1]))
> >          (nil)))
> > (insn 16 15 17 3 (parallel [
> >              (set (mem:BLK (reg:HI 26 r26) [0  A8])
> >                  (mem:BLK (reg:HI 30 r30) [0  A8]))
> >              (unspec [
> >                      (const_int 0 [0])
> >                  ] UNSPEC_CPYMEM)
> >              (use (reg:QI 22 r22 [52]))
> >              (clobber (reg:HI 26 r26))
> >              (clobber (reg:HI 30 r30))
> >              (clobber (reg:QI 0 r0))
> >              (clobber (reg:QI 22 r22 [52]))
> >          ]) "pr53505.c":21:22 132 {cpymem_qi}
> >       (nil))
> > 
> > LRA generates insn 59 that clobbers r30 set in insn 14, causing an execution
> > failure down the line.
> > 
> > How should the avr backend deal with this?
> > 
> > Regards
> > Senthil
> > 
> > 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: LRA for avr: Handling hard regs set directly at expand
  2023-07-28  5:04   ` SenthilKumar.Selvaraj
@ 2023-07-28  8:11     ` Georg-Johann Lay
  0 siblings, 0 replies; 6+ messages in thread
From: Georg-Johann Lay @ 2023-07-28  8:11 UTC (permalink / raw)
  To: SenthilKumar.Selvaraj, gcc; +Cc: vmakarov



Am 28.07.23 um 07:04 schrieb SenthilKumar.Selvaraj@microchip.com:
> On Thu, 2023-07-27 at 15:11 +0200, Georg-Johann Lay wrote:
>> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>>
>> Am 17.07.23 um 13:33 schrieb SenthilKumar.Selvaraj--- via Gcc:
>>> Hi,
>>>
>>>     The avr target has a bunch of patterns that directly set hard regs at expand time, like so
>>
>> The correct approach would be to use usual predicates together with
>> constraints that describe the register instead of hard regs, e.g.
>> (match_operand:HI n "register_operand" "R18_2") for a 2-byte register
>> that starts at R18 instead of (reg:HI 18).  I deprecated and removed
>> constraints starting with "R" long ago in order to get "R" free for that
>> purpose.
>>
>> Some years ago I tried such constraints (and hence also zoo of new
>> register classes that are required to accommodate them).  The resulting
>> code quality was so bad that I quickly abandoned that approach, and IIRC
>> there were also spill fails.  Appears that reload / ira was overwhelmed
>> by the multitude of new reg classes and took sub-optimal decisions.
>>
>> The way out was more of explicit hard regs in expand, together with
>> awkward functionalities like avr_fix_operands (PR63633) and the
>> functions that use it.  That way we get correct code without performance
>> penalties in unrelated places.
>>
>> Most of such insns are explicitly modelling hand-written asm functions
>> in libgcc, because most of these functions have a footprint smaller than
>> the default ABI.  And some functions have an interface not complying to
>> default ABI.
>>
>> For the case of cpymem etc from below, explicit hard registers were used
>> because register allocator did a bad job when using constraints like "e"
>> (X, Y, or Z).
> 
> I guessed that much. Yes, using constraints works - I used "x" and "z" that
> directly correspond to REG_X and REG_Z (ignore the weird operand numbering).
> 
> diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
> index be0f8dcbe0e..6c6c4e4e212 100644
> --- a/gcc/config/avr/avr.md
> +++ b/gcc/config/avr/avr.md
> @@ -1148,20 +1148,20 @@
>   ;; "cpymem_qi"
>   ;; "cpymem_hi"
>   (define_insn_and_split "cpymem_<mode>"
> -  [(set (mem:BLK (reg:HI REG_X))
> -        (mem:BLK (reg:HI REG_Z)))
> +  [(set (mem:BLK (match_operand:HI 3 "register_operand" "+x"))
> +        (mem:BLK (match_operand:HI 4 "register_operand" "+z")))
>      (unspec [(match_operand:QI 0 "const_int_operand" "n")]
>              UNSPEC_CPYMEM)
>      (use (match_operand:QIHI 1 "register_operand" "<CPYMEM_r_d>"))
> -   (clobber (reg:HI REG_X))
> -   (clobber (reg:HI REG_Z))
> +   (clobber (match_dup 3))
> +   (clobber (match_dup 4))

With some other insns I encountered problem with clobber (match_dup), 
similar to PR50788, and thus used clobber of operand with constraint as 
proposed in that PR.  See also comments for insn "*add.for.eqne.<mode>".

>      (clobber (reg:QI LPM_REGNO))
>      (clobber (match_operand:QIHI 2 "register_operand" "=1"))]
>     ""
>     "#"
>     "&& reload_completed"
> -  [(parallel [(set (mem:BLK (reg:HI REG_X))
> -                   (mem:BLK (reg:HI REG_Z)))
> +  [(parallel [(set (mem:BLK (match_dup 3))
> +                   (mem:BLK (match_dup 4)))

This split happens when reload_completed, thus using hard regs should be 
okay (match_dup will be X or Z anyway).

>                 (unspec [(match_dup 0)]
>                         UNSPEC_CPYMEM)
>                 (use (match_dup 1))
> 
> I know you did these changes a long time ago, but do you happen to have any
> test cases lying around that I can use to see if LRA does a better job than
> classic reload?

No.  But the test cases I used were straight forward lines of code.

Johann

> Vladimir, given that classic reload handled such hardcoded hard regs just
> fine, should LRA also be able to deal with them the same way? Or is this
> something that LRA is not going to support?
> 
> Regards
> Senthil

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: LRA for avr: Handling hard regs set directly at expand
  2023-07-17 11:33 LRA for avr: Handling hard regs set directly at expand SenthilKumar.Selvaraj
  2023-07-27 13:11 ` Georg-Johann Lay
@ 2023-08-02 16:54 ` Vladimir Makarov
  2023-08-03  5:53   ` SenthilKumar.Selvaraj
  1 sibling, 1 reply; 6+ messages in thread
From: Vladimir Makarov @ 2023-08-02 16:54 UTC (permalink / raw)
  To: SenthilKumar.Selvaraj, gcc


On 7/17/23 07:33, SenthilKumar.Selvaraj@microchip.com wrote:
> Hi,
>
>    The avr target has a bunch of patterns that directly set hard regs at expand time, like so
>
> (define_expand "cpymemhi"
>    [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
>                     (match_operand:BLK 1 "memory_operand" ""))
>                (use (match_operand:HI 2 "const_int_operand" ""))
>                (use (match_operand:HI 3 "const_int_operand" ""))])]
>    ""
>    {
>      if (avr_emit_cpymemhi (operands))
>        DONE;
>
>      FAIL;
>    })
>
> where avr_emit_cpymemhi generates
>
> (insn 14 13 15 4 (set (reg:HI 30 r30)
>          (reg:HI 48 [ ivtmp.10 ])) "pr53505.c":21:22 -1
>       (nil))
> (insn 15 14 16 4 (set (reg:HI 26 r26)
>          (reg/f:HI 38 virtual-stack-vars)) "pr53505.c":21:22 -1
>       (nil))
> (insn 16 15 17 4 (parallel [
>              (set (mem:BLK (reg:HI 26 r26) [0  A8])
>                  (mem:BLK (reg:HI 30 r30) [0  A8]))
>              (unspec [
>                      (const_int 0 [0])
>                  ] UNSPEC_CPYMEM)
>              (use (reg:QI 52))
>              (clobber (reg:HI 26 r26))
>              (clobber (reg:HI 30 r30))
>              (clobber (reg:QI 0 r0))
>              (clobber (reg:QI 52))
>          ]) "pr53505.c":21:22 -1
>       (nil))
>
> Classic reload knows about these - find_reg masks out bad_spill_regs, and bad_spill_regs
> when ORed with chain->live_throughout in order_regs_for_reload picks up r30.
>
> LRA, however, appears to not consider that, and proceeds to use such regs as reload regs.
> For the same source, it generates
> <snip>
>   Choosing alt 0 in insn 15:  (0) =r  (1) r {*movhi_split}
>        Creating newreg=70, assigning class GENERAL_REGS to r70
>     15: r26:HI=r70:HI
>        REG_EQUAL r28:HI+0x1
>      Inserting insn reload before:
>     58: r70:HI=r28:HI+0x1
>
> Choosing alt 3 in insn 58:  (0) d  (1) 0  (2) nYnn {*addhi3_split}
>        Creating newreg=71 from oldreg=70, assigning class LD_REGS to r71
>     58: r71:HI=r71:HI+0x1
>      Inserting insn reload before:
>     59: r71:HI=r28:HI
>      Inserting insn reload after:
>     60: r70:HI=r71:HI
>
> ********** Assignment #1: **********
>
> 	 Assigning to 71 (cl=LD_REGS, orig=70, freq=3000, tfirst=71, tfreq=3000)...
> 	   Assign 30 to reload r71 (freq=3000)
> 	Hard reg 26 is preferable by r70 with profit 1000
> 	Hard reg 30 is preferable by r70 with profit 1000
> 	 Assigning to 70 (cl=GENERAL_REGS, orig=70, freq=2000, tfirst=70, tfreq=2000)...
> 	   Assign 30 to reload r70 (freq=2000)
>
>
> (insn 14 13 59 3 (set (reg:HI 30 r30)
>          (reg:HI 18 r18 [orig:48 ivtmp.10 ] [48])) "pr53505.c":21:22 101 {*movhi_split}
>       (nil))
> (insn 59 14 58 3 (set (reg:HI 30 r30 [70])
>          (reg/f:HI 28 r28)) "pr53505.c":21:22 101 {*movhi_split}
>       (nil))
> (insn 58 59 15 3 (set (reg:HI 30 r30 [70])
>          (plus:HI (reg:HI 30 r30 [70])
>              (const_int 1 [0x1]))) "pr53505.c":21:22 165 {*addhi3_split}
>       (nil))
> (insn 15 58 16 3 (set (reg:HI 26 r26)
>          (reg:HI 30 r30 [70])) "pr53505.c":21:22 101 {*movhi_split}
>       (expr_list:REG_EQUAL (plus:HI (reg/f:HI 28 r28)
>              (const_int 1 [0x1]))
>          (nil)))
> (insn 16 15 17 3 (parallel [
>              (set (mem:BLK (reg:HI 26 r26) [0  A8])
>                  (mem:BLK (reg:HI 30 r30) [0  A8]))
>              (unspec [
>                      (const_int 0 [0])
>                  ] UNSPEC_CPYMEM)
>              (use (reg:QI 22 r22 [52]))
>              (clobber (reg:HI 26 r26))
>              (clobber (reg:HI 30 r30))
>              (clobber (reg:QI 0 r0))
>              (clobber (reg:QI 22 r22 [52]))
>          ]) "pr53505.c":21:22 132 {cpymem_qi}
>       (nil))
>
> LRA generates insn 59 that clobbers r30 set in insn 14, causing an execution
> failure down the line.
>
> How should the avr backend deal with this?
>
Sorry for the big delay with the answer.  I was on vacation.

There are probably some ways to fix it by changing patterns as other 
people suggested but I'd like to see the current patterns work for LRA 
as well.

Could you send me the test case on which I could reproduce the problem 
and work on implementing such functionality.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: LRA for avr: Handling hard regs set directly at expand
  2023-08-02 16:54 ` Vladimir Makarov
@ 2023-08-03  5:53   ` SenthilKumar.Selvaraj
  0 siblings, 0 replies; 6+ messages in thread
From: SenthilKumar.Selvaraj @ 2023-08-03  5:53 UTC (permalink / raw)
  To: gcc, vmakarov

On Wed, 2023-08-02 at 12:54 -0400, Vladimir Makarov wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> On 7/17/23 07:33, SenthilKumar.Selvaraj@microchip.com wrote:
> > Hi,
> > 
> >    The avr target has a bunch of patterns that directly set hard regs at expand time, like so
> > 
> > (define_expand "cpymemhi"
> >    [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
> >                     (match_operand:BLK 1 "memory_operand" ""))
> >                (use (match_operand:HI 2 "const_int_operand" ""))
> >                (use (match_operand:HI 3 "const_int_operand" ""))])]
> >    ""
> >    {
> >      if (avr_emit_cpymemhi (operands))
> >        DONE;
> > 
> >      FAIL;
> >    })
> > 
> > where avr_emit_cpymemhi generates
> > 
> > (insn 14 13 15 4 (set (reg:HI 30 r30)
> >          (reg:HI 48 [ ivtmp.10 ])) "pr53505.c":21:22 -1
> >       (nil))
> > (insn 15 14 16 4 (set (reg:HI 26 r26)
> >          (reg/f:HI 38 virtual-stack-vars)) "pr53505.c":21:22 -1
> >       (nil))
> > (insn 16 15 17 4 (parallel [
> >              (set (mem:BLK (reg:HI 26 r26) [0  A8])
> >                  (mem:BLK (reg:HI 30 r30) [0  A8]))
> >              (unspec [
> >                      (const_int 0 [0])
> >                  ] UNSPEC_CPYMEM)
> >              (use (reg:QI 52))
> >              (clobber (reg:HI 26 r26))
> >              (clobber (reg:HI 30 r30))
> >              (clobber (reg:QI 0 r0))
> >              (clobber (reg:QI 52))
> >          ]) "pr53505.c":21:22 -1
> >       (nil))
> > 
> > Classic reload knows about these - find_reg masks out bad_spill_regs, and bad_spill_regs
> > when ORed with chain->live_throughout in order_regs_for_reload picks up r30.
> > 
> > LRA, however, appears to not consider that, and proceeds to use such regs as reload regs.
> > For the same source, it generates
> > <snip>
> >   Choosing alt 0 in insn 15:  (0) =r  (1) r {*movhi_split}
> >        Creating newreg=70, assigning class GENERAL_REGS to r70
> >     15: r26:HI=r70:HI
> >        REG_EQUAL r28:HI+0x1
> >      Inserting insn reload before:
> >     58: r70:HI=r28:HI+0x1
> > 
> > Choosing alt 3 in insn 58:  (0) d  (1) 0  (2) nYnn {*addhi3_split}
> >        Creating newreg=71 from oldreg=70, assigning class LD_REGS to r71
> >     58: r71:HI=r71:HI+0x1
> >      Inserting insn reload before:
> >     59: r71:HI=r28:HI
> >      Inserting insn reload after:
> >     60: r70:HI=r71:HI
> > 
> > ********** Assignment #1: **********
> > 
> >        Assigning to 71 (cl=LD_REGS, orig=70, freq=3000, tfirst=71, tfreq=3000)...
> >          Assign 30 to reload r71 (freq=3000)
> >       Hard reg 26 is preferable by r70 with profit 1000
> >       Hard reg 30 is preferable by r70 with profit 1000
> >        Assigning to 70 (cl=GENERAL_REGS, orig=70, freq=2000, tfirst=70, tfreq=2000)...
> >          Assign 30 to reload r70 (freq=2000)
> > 
> > 
> > (insn 14 13 59 3 (set (reg:HI 30 r30)
> >          (reg:HI 18 r18 [orig:48 ivtmp.10 ] [48])) "pr53505.c":21:22 101 {*movhi_split}
> >       (nil))
> > (insn 59 14 58 3 (set (reg:HI 30 r30 [70])
> >          (reg/f:HI 28 r28)) "pr53505.c":21:22 101 {*movhi_split}
> >       (nil))
> > (insn 58 59 15 3 (set (reg:HI 30 r30 [70])
> >          (plus:HI (reg:HI 30 r30 [70])
> >              (const_int 1 [0x1]))) "pr53505.c":21:22 165 {*addhi3_split}
> >       (nil))
> > (insn 15 58 16 3 (set (reg:HI 26 r26)
> >          (reg:HI 30 r30 [70])) "pr53505.c":21:22 101 {*movhi_split}
> >       (expr_list:REG_EQUAL (plus:HI (reg/f:HI 28 r28)
> >              (const_int 1 [0x1]))
> >          (nil)))
> > (insn 16 15 17 3 (parallel [
> >              (set (mem:BLK (reg:HI 26 r26) [0  A8])
> >                  (mem:BLK (reg:HI 30 r30) [0  A8]))
> >              (unspec [
> >                      (const_int 0 [0])
> >                  ] UNSPEC_CPYMEM)
> >              (use (reg:QI 22 r22 [52]))
> >              (clobber (reg:HI 26 r26))
> >              (clobber (reg:HI 30 r30))
> >              (clobber (reg:QI 0 r0))
> >              (clobber (reg:QI 22 r22 [52]))
> >          ]) "pr53505.c":21:22 132 {cpymem_qi}
> >       (nil))
> > 
> > LRA generates insn 59 that clobbers r30 set in insn 14, causing an execution
> > failure down the line.
> > 
> > How should the avr backend deal with this?
> > 
> Sorry for the big delay with the answer.  I was on vacation.
> 
> There are probably some ways to fix it by changing patterns as other
> people suggested but I'd like to see the current patterns work for LRA
> as well.
> 
> Could you send me the test case on which I could reproduce the problem
> and work on implementing such functionality.
> 
> 
Thanks for taking your time to look at this.

To reproduce the behavior, apply the below patch on master

diff --git gcc/config/avr/avr.cc gcc/config/avr/avr.cc
index 25f3f4c22e0..a9ab8259339 100644
--- gcc/config/avr/avr.cc
+++ gcc/config/avr/avr.cc
@@ -1574,6 +1574,9 @@ avr_allocate_stack_slots_for_args (void)
 static bool
 avr_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
 {
+  if (lra_in_progress && to == STACK_POINTER_REGNUM)
+         return false;
+
   return ((frame_pointer_needed && to == FRAME_POINTER_REGNUM)
           || !frame_pointer_needed);
 }
@@ -15246,9 +15249,6 @@ avr_float_lib_compare_returns_bool (machine_mode mode, enum rtx_code)
 #undef  TARGET_CONVERT_TO_TYPE
 #define TARGET_CONVERT_TO_TYPE avr_convert_to_type
 
-#undef TARGET_LRA_P
-#define TARGET_LRA_P hook_bool_void_false
-
 #undef  TARGET_ADDR_SPACE_SUBSET_P
 #define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p
 
diff --git gcc/config/avr/avr.h gcc/config/avr/avr.h
index 8e7e00db13b..f82f429b94a 100644
--- gcc/config/avr/avr.h
+++ gcc/config/avr/avr.h
@@ -313,8 +313,7 @@ enum reg_class {
 #define ELIMINABLE_REGS {                                      \
     { ARG_POINTER_REGNUM, STACK_POINTER_REGNUM },               \
     { ARG_POINTER_REGNUM, FRAME_POINTER_REGNUM },               \
-    { FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM },             \
-    { FRAME_POINTER_REGNUM + 1, STACK_POINTER_REGNUM + 1 } }
+    { FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM } }
 
 #define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET)                   \
   OFFSET = avr_initial_elimination_offset (FROM, TO)

Then configure and build for avr without attempting to build libgcc

$ configure --target=avr --prefix=<prefix_dir> --enable-languages=c && make all-host && make install-host

And finally to reproduce the failure

$ cat test.c
#include <stdbool.h>

struct A
{
  unsigned int a;
  unsigned char c1, c2;
  bool b1 : 1;
};

void
foo (const struct A *x, int y)
{
  int s = 0, i;
  for (i = 0; i < y; ++i)
    {
      const struct A a = x[i];
      s += a.b1 ? 1 : 0;
    }
  if (s != 0)
    __builtin_abort ();
}

$ avr-gcc -mmcu=avr51 -Os test.c -S -o - -fdump-rtl-all
You can find the below insn sequence in test.c.*.reload

(insn 13 12 14 3 (set (reg:QI 22 r22 [52])
        (const_int 5 [0x5])) "test.c":16:22 86 {movqi_insn_split}
     (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))
(insn 14 13 59 3 (set (reg:HI 30 r30)
        (reg:HI 18 r18 [orig:48 ivtmp.9 ] [48])) "test.c":16:22 101 {*movhi_split}
     (nil))
(insn 59 14 58 3 (set (reg:HI 30 r30 [70])
        (reg/f:HI 28 r28)) "test.c":16:22 101 {*movhi_split}
     (nil))
(insn 58 59 15 3 (set (reg:HI 30 r30 [70])
        (plus:HI (reg:HI 30 r30 [70])
            (const_int 1 [0x1]))) "test.c":16:22 165 {*addhi3_split}
     (nil))
(insn 15 58 16 3 (set (reg:HI 26 r26)
        (reg:HI 30 r30 [70])) "test.c":16:22 101 {*movhi_split}
     (expr_list:REG_EQUAL (plus:HI (reg/f:HI 28 r28)
            (const_int 1 [0x1]))
        (nil)))
(insn 16 15 17 3 (parallel [
            (set (mem:BLK (reg:HI 26 r26) [0  A8])
                (mem:BLK (reg:HI 30 r30) [0  A8]))
            (unspec [
                    (const_int 0 [0])
                ] UNSPEC_CPYMEM)
            (use (reg:QI 22 r22 [52]))
            (clobber (reg:HI 26 r26))
            (clobber (reg:HI 30 r30))
            (clobber (reg:QI 0 r0))
            (clobber (reg:QI 22 r22 [52]))
        ]) "test.c":16:22 132 {cpymem_qi}
     (nil))

Classic reload produces

(insn 13 12 14 3 (set (reg:QI 22 r22 [52])
        (const_int 5 [0x5])) "test.c":16:22 86 {movqi_insn_split}
     (expr_list:REG_EQUAL (const_int 5 [0x5])
        (nil)))
(insn 14 13 59 3 (set (reg:HI 30 r30)
        (reg:HI 18 r18 [orig:48 ivtmp.9 ] [48])) "test.c":16:22 101 {*movhi_split}
     (nil))
(insn 59 14 15 3 (set (reg:HI 26 r26)
        (reg/f:HI 28 r28)) "test.c":16:22 101 {*movhi_split}
     (nil))
(insn 15 59 16 3 (set (reg:HI 26 r26)
        (plus:HI (reg:HI 26 r26)
            (const_int 1 [0x1]))) "test.c":16:22 165 {*addhi3_split}
     (expr_list:REG_EQUAL (plus:HI (reg/f:HI 28 r28)
            (const_int 1 [0x1]))
        (nil)))
(insn 16 15 17 3 (parallel [
            (set (mem:BLK (reg:HI 26 r26) [0  A8])
                (mem:BLK (reg:HI 30 r30) [0  A8]))
            (unspec [
                    (const_int 0 [0])
                ] UNSPEC_CPYMEM)
            (use (reg:QI 22 r22 [52]))
            (clobber (reg:HI 26 r26))
            (clobber (reg:HI 30 r30))
            (clobber (reg:QI 0 r0))
            (clobber (reg:QI 22 r22 [52]))
        ]) "test.c":16:22 132 {cpymem_qi}
     (nil))

Regards
Senthil

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-08-03  5:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-17 11:33 LRA for avr: Handling hard regs set directly at expand SenthilKumar.Selvaraj
2023-07-27 13:11 ` Georg-Johann Lay
2023-07-28  5:04   ` SenthilKumar.Selvaraj
2023-07-28  8:11     ` Georg-Johann Lay
2023-08-02 16:54 ` Vladimir Makarov
2023-08-03  5:53   ` SenthilKumar.Selvaraj

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).