public inbox for gcc@gcc.gnu.org
* Multiple types of load/store: how to create .md rules?
@ 2022-05-03  2:10 Andras Tantos
  2022-05-04 10:07 ` Julian Brown
  0 siblings, 1 reply; 2+ messages in thread
From: Andras Tantos @ 2022-05-03  2:10 UTC (permalink / raw)
  To: gcc

All,

Thanks for all the help so far. I'm (still) working on porting
GCC to a new processor ISA and ran into the following problem: the CPU
supports two kinds of register+offset based loads (and stores).

The generic format accepts any base register and any offset. The syntax
for this type of operation is:

  $rX <- mem[$rB + <ofs>]

For code compactness reasons, there's another (shorter) form that
accepts only $sp and $fp (the stack and frame pointers) as base
registers, and an offset in the range of -256 to 252. Finally, this
form only supports a transfer size of 32 bits. The syntax for this
format is:

  $rX <- mem[tiny $rB + <ofs>]

I would like to coerce GCC into using the 'tiny' form, whenever it can.
In order to do that, I've created a new predicate:

  (define_predicate "brew_tiny_memory_operand"
    (and
      (match_operand 0 "memory_operand")
      (match_test
        "MEM_P(op) && 
        brew_legitimate_tiny_address_p(XEXP(op, 0))"
      )
    )
  )

The function referenced in the predicate is as follows:

  static bool
  brew_reg_ok_for_tiny_base_p(const_rtx reg)
  {
    int regno = REGNO(reg);

    return
      regno == BREW_REG_FP ||
      regno == BREW_REG_SP ||
      regno == BREW_QFP ||
      regno == BREW_QAP;
  }

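  /* Return true if X is a <base> + <ofs> address that the 'tiny' form
     can encode: an allowed base register plus a constant offset in
     [-256, 252] that is a multiple of 4.  */
  bool
  brew_legitimate_tiny_address_p(rtx x)
  {
    return
      GET_CODE(x) == PLUS &&
      REG_P(XEXP(x, 0)) &&
      brew_reg_ok_for_tiny_base_p(XEXP(x, 0)) &&
      CONST_INT_P(XEXP(x, 1)) &&
      IN_RANGE(INTVAL(XEXP(x, 1)), -256, 252) &&
      (INTVAL(XEXP(x, 1)) & 3) == 0;
  }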
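
As a rough illustration of what this check accepts, here is a standalone
C model of it that works on plain integers rather than RTL. The register
numbers and the tiny_* helper names are made up for this sketch and are
not taken from the port; only the $sp/$fp restriction, the -256..252
range and the multiple-of-4 test come from the code above.

```c
#include <stdbool.h>

/* Hypothetical register numbers, for illustration only. */
enum { MODEL_REG_A1 = 5, MODEL_REG_FP = 12, MODEL_REG_SP = 13 };

/* Only the frame and stack pointers may serve as a 'tiny' base. */
static bool tiny_base_ok(int regno)
{
  return regno == MODEL_REG_FP || regno == MODEL_REG_SP;
}

/* The offset must lie in [-256, 252] and be a multiple of 4. */
static bool tiny_offset_ok(long ofs)
{
  return ofs >= -256 && ofs <= 252 && (ofs & 3) == 0;
}

/* A base+offset pair fits the 'tiny' form only if both parts do. */
static bool tiny_address_ok(int regno, long ofs)
{
  return tiny_base_ok(regno) && tiny_offset_ok(ofs);
}
```

Under these assumptions, tiny_address_ok(MODEL_REG_FP, -32) is true,
while an address based on any other register, or an offset such as -30
or 256, is rejected.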

Finally, I've created rules for the use of these new predicates:

  (define_expand "movsi"
    [(set
      (match_operand:SI 0 "general_operand" "")
      (match_operand:SI 1 "general_operand" "")
    )]
    ""
    "
  {
    if (!(reload_in_progress || reload_completed))
      {
        if(MEM_P(operands[0]))
          {
            // For stores, force the second arg. into a register
            operands[1] = force_reg(SImode, operands[1]);
            // We should make sure that the address
            // generated for the store is based on a
            // <reg>+<offset> pattern
            if(MEM_P(XEXP(operands[0], 0)))
              operands[0] = gen_rtx_MEM(
                SImode,
                force_reg(SImode, XEXP(operands[0], 0))
              );
          }
        else if(MEM_P(operands[1]))
          {
            // We should make sure that the address
            // generated for the load is based on a
            // <reg>+<offset> pattern
            if(MEM_P(XEXP (operands[1], 0)))
              operands[1] = gen_rtx_MEM(
                SImode,
                force_reg(SImode, XEXP(operands[1], 0))
              );
          }
      }
  }")

  (define_insn "*movsi_tiny_store"
    [(set
      (match_operand:SI 0 "brew_tiny_memory_operand"  "=m")
      (match_operand:SI 1 "register_operand"          "r")
    )]
    ""
    "mem[tiny %0] <- %1"
    [(set_attr "length" "2")]
  )

  (define_insn "*movsi_tiny_load"
    [(set
      (match_operand:SI 0 "register_operand"          "=r")
      (match_operand:SI 1 "brew_tiny_memory_operand"  "m")
    )]
    ""
    "%0 <- mem[tiny %1]"
    [(set_attr "length" "2")]
  )

  (define_insn "*movsi_general"
    [(set
      (match_operand:SI 0 "nonimmediate_operand"  "=r,r,r,r,m,r")
      (match_operand:SI 1 "general_operand"        "N,L,i,r,r,m")
    )]
    ""
    "@
    %0 <- tiny %1
    %0 <- short %1
    %0 <- %1
    %0 <- %1
    mem[%0] <- %1
    %0 <- mem[%1]"
    [(set_attr "length" "2,4,6,2,6,6")]
  )

When I tested this code, I noticed a funny thing: the function
prologs and epilogs seem to use the 'tiny' versions of loads/stores
just fine. However, (I think) some of the spills/reloads for local
variables end up using the extended version. Even more surprising is
that this behavior only shows up in optimized (-Os, -O1, -O2)
compilation; -O0 seems to be free from this problem. Here's one
example:

    .file	"dtoa.c"
    .text
    .global	__udivsi3
    .p2align	1
    .global	quorem
    .type	quorem, @function
  quorem:
    mem[tiny $sp + -4] <- $fp ########### <------- OK
    mem[tiny $sp + -8] <- $lr
    mem[tiny $sp + -12] <- $r8
    mem[tiny $sp + -16] <- $r9
    mem[tiny $sp + -20] <- $r10
    mem[tiny $sp + -24] <- $r11
    $fp <- $sp
    $sp <- short $sp - 48
    $r9 <- $a0
    $r8 <- mem[$a1 + 16]
    $r0 <- mem[$a0 + 16]
    if signed $r0 < $r8 $pc <- .L7
    $r8 <- tiny $r8 + -1
    $r0 <- tiny 2
    $r0 <- $r8 << $r0
    $r10 <- short $a0 + 20
    $r1 <- $r10 + $r0
    mem[$fp + -32] <- $r1  ############## <-------- should be 'tiny'
    $r11 <- mem[$r1]

In reply to an earlier question of mine, Andrew Pinski suggested that I
merge all *movsi patterns into a single one to avoid (in that case)
strange deletions in the generated assembly. Is that possible here? It
appears to me that I would need the ability to differentiate the
different patterns using constraints, but is there a way to define
custom versions of the 'm' constraint? I didn't find anything on that in
the documentation. Did I miss something?

Thanks a bunch,
Andras




* Re: Multiple types of load/store: how to create .md rules?
  2022-05-03  2:10 Multiple types of load/store: how to create .md rules? Andras Tantos
@ 2022-05-04 10:07 ` Julian Brown
  0 siblings, 0 replies; 2+ messages in thread
From: Julian Brown @ 2022-05-04 10:07 UTC (permalink / raw)
  To: gcc

On Mon, 02 May 2022 19:10:41 -0700
Andras Tantos <andras@tantosonline.com> wrote:

> In reply to an earlier question of mine, Andrew Pinski suggested that I
> merge all *movsi patterns into a single one to avoid (in that case)
> strange deletions in the generated assembly. Is that possible here? It
> appears to me that I would need the ability to differentiate the
> different patterns using constraints, but is there a way to define
> custom versions of the 'm' constraint? I didn't find anything on that in
> the documentation. Did I miss something?

Check how existing ports use "define_memory_constraint"; it is
documented here:

https://gcc.gnu.org/onlinedocs/gccint/Define-Constraints.html#index-define_005fmemory_005fconstraint

HTH,

Julian


