public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Bernd Schmidt <bschmidt@redhat.com>, gcc-patches@gcc.gnu.org
Subject: Re: RTL frontend input format again (was Re: [PATCH 15/16] RTL frontend (rtl1), on top of dump reader)
Date: Fri, 07 Oct 2016 13:27:00 -0000	[thread overview]
Message-ID: <1475846819.30303.21.camel@redhat.com> (raw)
In-Reply-To: <4ed3a1bd-4efd-d2b8-1755-81ef22db96cc@redhat.com>

On Fri, 2016-10-07 at 12:38 +0200, Bernd Schmidt wrote:
> On 10/06/2016 09:53 PM, David Malcolm wrote:
> > A benefit of keeping the INSN_UIDs is that if you've spent half an
> > hour
> > single-stepping through RTL modification and know that INSN_UID
> > 1045 is
> > the one that gets corrupted, you can have that insn have UID 1045
> > in
> > the testcase.  Otherwise the UIDs get determined by the parser, and
> > can
> > change if other insns get inserted.
> 
> That's true, but you wouldn't construct a testcase until you're done 
> debugging, and this reason is lost for any testcase that has been 
> committed to the repository for more than say a week. I think we want
> a 
> format that is lean and abstracts away unnecessary details.
> > 
> > Here's a revised version of that example
> > (gcc/testsuite/rtl.dg/aarch64/pr71779.rtl), with the things you
> > mention
> > removed, and with some indentation to aid readability:
> > 
> > ;; { dg-do compile { target aarch64-*-* } }
> > ;; { dg-options "-fsingle-pass=rtl-cse1 -fdump-rtl-cse1" }
> > 
> > (function "fragment"
> >   (insn-chain
> >     ;; MEM[(struct isl_obj *)&obj1] = &isl_obj_map_vtable;
> >     (insn 2 (set (reg:SI 480)
> >                  (high:SI (symbol_ref:SI ("isl_obj_map_vtable")
> >                               [flags 0xc0] <var_decl
> > isl_obj_map_vtable>)))
> >       y.c:12702)
> >     (insn 2 (set (reg/f:SI 479)
> >                  (lo_sum:SI (reg:SI 480)
> >                             (symbol_ref:SI ("isl_obj_map_vtable")
> >                                [flags 0xc0]
> >                                <var_decl 0x7fa0363ea240
> > isl_obj_map_vtable>)))
> >       y.c:12702
> >       (expr_list:REG_EQUAL (symbol_ref:SI ("isl_obj_map_vtable")
> >                             [flags 0xc0]
> >                             <var_decl isl_obj_map_vtable>)))
> >     (insn 2 (set (reg:DI 481)
> >                  (subreg:DI (reg/f:SI 479) 0)) y.c:12702)
> >     (insn 2 (set (zero_extract:DI (reg/v:DI 191 [ obj1D.17368 ])
> >                                   (const_int 32 [0x20])
> >                                   (const_int 0 [0]))
> >                  (reg:DI 481)) y.c:12702)
> >     ;; Extra insn, to avoid all of the above from being deleted by
> > DCE
> >     (insn 2 (set (mem:DI (reg:DI 191) [1 i+0 S4 A32])
> >                          (const_int 1 [0x1])))
> > 
> >   ) ;; insn-chain
> > ) ;; function
> > 
> > This is with the optional cfg and crtl directives omitted.
> > 
> > Does the above look acceptable?
> 
> One thing this doesn't show is CODE_LABELs, which will need some sort
> of 
> unique identifier.

Note that all of my examples are hand-edited, so there may be mistakes,
inconsistencies, etc.

We could simply print the INSN_UID for CODE_LABELs; something like this
(see the "(code_label 16" below):

  (jump_insn (set (pc)
          (if_then_else (ge (reg:CCGC 17 flags)
                  (const_int 0 [0]))
              (label_ref 16)
              (pc))) "test.c":3
       
   -> 16)
  (note [bb 4] NOTE_INSN_BASIC_BLOCK)
  (insn (set (reg:SI 90)
        (mem/c:SI (plus:DI (reg/f:DI virtual-stack-vars)
                  (const_int -12 [0xfffffffffffffff4])) [1 k+0 S4 A32])) "test.c":4
       )
  (insn (parallel [
              (set (reg:SI 87 [ _1 ])
                  (plus:SI (reg:SI 90)
                      (const_int 4 [0x4])))
              (clobber (reg:CC 17 flags))
          ]) "test.c":4
       (expr_list:REG_EQUAL (plus:SI (mem/c:SI (plus:DI (reg/f:DI virtual-stack-vars)
                      (const_int -12 [0xfffffffffffffff4])) [1 k+0 S4 A32])
              (const_int 4 [0x4]))
          ))
  (jump_insn (set (pc) (label_ref 20)) "test.c":4       
   -> 20)
  (barrier)
  (code_label 16 [1 uses])
  (note [bb 5] NOTE_INSN_BASIC_BLOCK)

Should we drop the "[1 uses]" in the code_label?

> There's still the 0x7fa0363ea240 for the var-decl. You want the 
> dump-noaddr flag.

(indeed; I'm hand-editing the examples)

> > The "2" values after each "insn" are the basic block indices. 
> >  Should
> > these be omitted also?
> 
> I think they probably don't buy us much if we have block notes.

Unfortunately, the (code_label 16) above is in basic block *5*, not
basic_block 4 i.e. as you traverse the insn chaing the change of basic
block can happen before the NOTE_INSN_BASIC_BLOCK.  There is a barrier
and a code_label, but even so, this seems rather counter-intuitive to
me.

You appear to have trimmed the idea of enclosing the insns with (basic-block) directives without commenting on it.  Did you like this idea?
It would make the above look like:

  (basic-block 2
    ;; insns snipped
    (jump_insn (set (pc)
            (if_then_else (ge (reg:CCGC 17 flags)
                    (const_int 0 [0]))
                (label_ref 16)
                (pc))) "test.c":3
   -> 16)
  ) ;; basic-block 2
  (basic-block 4
    (note [bb 4] NOTE_INSN_BASIC_BLOCK)
    ;; insns snipped
    (jump_insn (set (pc) (label_ref 20)) "test.c":4       
     -> 20)
  ) ;; basic-block 4
  (barrier)
  (basic-block 5
    (code_label 16 [1 uses])
    (note [bb 5] NOTE_INSN_BASIC_BLOCK)
    ;; etc
  ) ;; basic-block 5

Note how the above format expresses clearly that:
* the (barrier) is part of the insn chain, but not in a basic block, and
* some insns can happen in a basic block

Taking this idea further: if we have (basic-block) directives integrated into the insn-chain like this, we could express the CFG by adding (edge) directives.  Here's a suggestion for doing it with (edge-from) and (edge-to) directives, expressing the predecessor and successor edges in the CFG, along with :

  (basic-block 2
    (edge-from 0)
    ;; insns snipped
    (jump_insn (set (pc)
            (if_then_else (ge (reg:CCGC 17 flags)
                    (const_int 0 [0]))
                (label_ref 16)
                (pc))) "test.c":3
   -> 16)
    (edge-to 4 (flags "FALLTHRU"))
    (edge-to 5 (flags ""))
  ) ;; basic-block 2
  (basic-block 4
    (edge-from 2 (flags "FALLTHRU"))
    (note [bb 4] NOTE_INSN_BASIC_BLOCK)
    ;; some insns snipped
    (jump_insn (set (pc) (label_ref 20)) "test.c":4       
     -> 20)
    (edge-to 6 (flags ""))
  ) ;; basic-block 4
  (barrier)
  (basic-block 5
    (edge-from 2 (flags ""))
    (code_label 16 [1 uses])
    (note [bb 5] NOTE_INSN_BASIC_BLOCK)
    ;; some insns snipped
    (edge-to 6 (flags ""))
  ) ;; basic-block 5


Do you like this approach?   It could also support insns that are on
edges (though I don't know if that's something we need to support in
dumps).

Should we spell "0" and "1" as "entry" and "exit" when parsing/dumping
basic block indices? e.g.:

  (basic-block 2
    (edge-from entry)

> What about file names/line numbers? We'll probably want some tests
> for 
> these, but for most others they will be irrelevant.

Yeah.  We cover this below.

> I think maybe you want a separate compact form of insns and notes 
> (cinsn/cnote maybe), with a flag selecting which gets written out in
> the 
> dumper. The reader could then read them in like any other rtx code,
> and 
> there'd be a postprocessing stage to reconstruct the insn chain.

By this separate compact form, do you mean the form we've been
discussing above, with no INSN_UID/PREV/NEXT, etc?  Or something else?

As for "cinsn", "cnote", how about just "insn" and "note", and having
the compactness be expressed at the top of the dump e.g. implicitly by
the presence of a "(function" directive.  Could even have a format
version specifier in the function directive, to give us some future
-proofing e.g.
  (function (format 20161007)
or somesuch.

> > My hope here is to that the input format is something that existing
> > gcc
> > developers will be comfortable reading, which is why I opted for
> > parsing the existing output - and I'm nervous about moving away too
> > much from the existing format.
> 
> Hey, I'm as existing a gcc developer as they come, and I don't much
> like 
> staring at unpruned rtl dumps. They've gotten worse over the years
> too.

Fair enough :)

Do you want to want to try hand-edited a test case, using some of the
format ideas we've been discussing?  That might suggest further
improvements to the format.

> > Some other possible changes: locations could do with a wrapper,
> > and/or
> > to quote the filename, since otherwise we can't cope with spaces in
> > filenames).  So instead of:
> > 
> >     (insn 2 (set (reg:DI 481)
> >                  (subreg:DI (reg/f:SI 479)
> > 0)) y.c:12702)
> > 
> > we could have:
> > 
> >     (insn 2 (set (reg:DI 481)
> >                  (subreg:DI (reg/f:SI 479) 0)) "y.c":12702)
> > 
> > or:
> > 
> >     (insn 2 (set (reg:DI 481)
> >                  (subreg:DI (reg/f:SI 479)
> > 0)) (srcloc "y.c" 12702))
> > 
> > Any preferences on these?
> 
> I suspect just putting it in quotes is good enough. Might be worth
> doing 
> that in general.

Good.  Let's quote the filenames.

> > > There was also a testcase where virtual regs still occurred with
> > > register number, I think it would be best to disallow this and
> > > add
> > > the
> > > ability to parse "(reg:DI virtual-outgoing-args)".
> > 
> > When we discussing pseudos you expressed a need to keep printing
> > the
> > regno for REGs, so that the dump format is usable in the debugger,
> > for
> > array indices etc.  Hence I'd prefer to keep printing the regno for
> > all
> > regnos, including virtuals.
> 
> This comes back to different needs for long-lived testcases vs. 
> debugging something. As suggested above, I think you need a flag for 
> different output formats, and for vregs the ability to parse "(reg:M 
> virtual-something)".

OK.  If so, do we need to print the regno for hard registers?  Or
should we just print the name for those, for consistency with virtual
regs?  (i.e. have it be controlled by the same flag?)

> > > Also I think I'd prefer testcases submitted separately from the
> > > RTL
> > > frontend. It seems we have two different frontend approaches
> > > here,
> > > one
> > > RTL frontend and one modification for the C frontend - which do
> > > you
> > > prefer?
> > 
> > Is it acceptable to do both?  I'd like to have an actual RTL
> > frontend,
> > though this punts on handling structs etc, whereas Richi seems to
> > prefer the modification to the C frontend, since it better supports
> > structs, aliasing, etc.  At this point, whichever gets me past
> > patch
> > review...
> 
> Does the C one have an advantage in terms of meaningful decls in 
> MEM_ATTRs/SYMBOL_REFs?

Probably.

Thanks
Dave

  reply	other threads:[~2016-10-07 13:27 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-05 15:44 [PATCH 00/16] RTL frontend (v3) David Malcolm
2016-10-05 15:44 ` [PATCH 11/16] df selftests (v3) David Malcolm
2016-10-05 15:44 ` [PATCH 01/16] read-md.c: Add various cleanups to ~rtx_reader David Malcolm
2016-10-05 15:51   ` Bernd Schmidt
2016-10-11 15:15     ` David Malcolm
2016-10-12 21:57       ` Richard Sandiford
2016-10-14 17:45         ` [PATCH] read-md.c: Move various state to within class rtx_reader David Malcolm
2016-10-17 11:05           ` Bernd Schmidt
2016-10-17 11:37           ` Richard Sandiford
2016-10-17 16:27             ` [PATCH] read-md.c: Move various state to within class rtx_reader (v3) David Malcolm
2016-10-17 20:23               ` Richard Sandiford
2016-10-05 15:44 ` [PATCH 08/16] (partially-approved): Introduce selftest::locate_file David Malcolm
2016-10-05 16:10   ` Bernd Schmidt
2016-10-05 15:44 ` [PATCH 07/16] read-md: add some helper functions David Malcolm
2016-10-05 15:57   ` Bernd Schmidt
2016-10-05 15:45 ` [PATCH 16/16] Add "__RTL" to cc1 David Malcolm
2016-10-05 16:10   ` Joseph Myers
2016-10-07 15:27     ` [PATCH] Add "__RTL" to cc1 (v2) David Malcolm
2016-10-13 13:49       ` Richard Biener
2016-10-13 13:52         ` Bernd Schmidt
2016-10-14  9:33           ` Richard Biener
2016-10-14  9:48             ` Bernd Schmidt
2016-10-14  9:50               ` Richard Biener
2016-10-14 19:25             ` David Malcolm
2016-10-14 19:27               ` Bernd Schmidt
2016-10-14 19:35                 ` David Malcolm
2016-10-14 19:23         ` David Malcolm
2016-10-05 15:45 ` [PATCH 03/16] (approved) selftest.h: add temp_override fixture David Malcolm
2016-10-05 15:45 ` [PATCH 14/16] RTL interpreter (work-in-progress) David Malcolm
2016-10-05 15:45 ` [PATCH 09/16] Split class rtx_reader into base_rtx_reader vs rtx_reader David Malcolm
2016-10-11 15:53   ` Bernd Schmidt
2016-10-18 20:30     ` David Malcolm
2016-10-19 14:45       ` Bernd Schmidt
2016-10-20 19:14         ` Richard Sandiford
2016-10-05 15:45 ` [PATCH 06/16] Introduce emit_status::ensure_regno_capacity David Malcolm
2016-10-05 15:55   ` Bernd Schmidt
2016-11-18 20:47     ` [PATCH] Introduce emit_status::ensure_regno_capacity (v5) David Malcolm
2016-11-22 13:34       ` Bernd Schmidt
2016-10-05 15:55   ` [PATCH 06/16] Introduce emit_status::ensure_regno_capacity Bernd Schmidt
2016-10-05 15:45 ` [PATCH 04/16] (approved) Expose forcibly_ggc_collect and run it after all selftests David Malcolm
2016-10-05 15:45 ` [PATCH 10/16] Introduce class function_reader (v3) David Malcolm
2016-10-05 16:00   ` Bernd Schmidt
2016-10-07 13:44     ` David Malcolm
2016-10-10 18:53       ` Richard Sandiford
2016-10-05 15:45 ` [PATCH 05/16] Introduce rtl_data::init_stack_alignment David Malcolm
2016-10-05 15:52   ` Bernd Schmidt
2016-10-05 15:45 ` [PATCH 15/16] RTL frontend (rtl1), on top of dump reader David Malcolm
2016-10-06 13:30   ` Bernd Schmidt
2016-10-06 19:53     ` RTL frontend input format again (was Re: [PATCH 15/16] RTL frontend (rtl1), on top of dump reader) David Malcolm
2016-10-06 19:59       ` David Malcolm
2016-10-07 10:38       ` Bernd Schmidt
2016-10-07 13:27         ` David Malcolm [this message]
2016-10-07 13:58           ` Bernd Schmidt
2016-10-07 18:08             ` David Malcolm
2016-10-12 10:45             ` [PATCH] print_rtx_function: integrate dumping of the CFG into the insn chain David Malcolm
2016-10-12 10:50               ` Bernd Schmidt
2016-10-12 17:17             ` [PATCH] Add a "compact" mode to print_rtx_function David Malcolm
2016-10-12 17:31               ` Bernd Schmidt
2016-10-12 20:06                 ` [PATCH] (v2) " David Malcolm
2016-10-13 10:21                   ` Bernd Schmidt
2016-10-13 15:22                     ` [PATCH] Omit INSN_LOCATION from compact dumps David Malcolm
2016-10-13 15:50                       ` Bernd Schmidt
2016-11-22 13:18                   ` [PATCH] (v2) Add a "compact" mode to print_rtx_function Dominik Vogt
2016-11-22 13:32                     ` Bernd Schmidt
2016-11-22 13:37                       ` Jakub Jelinek
2016-11-22 14:25                         ` David Malcolm
2016-11-22 14:39                           ` Dominik Vogt
2016-11-22 14:38                         ` Bernd Schmidt
2016-11-22 14:45                           ` Jakub Jelinek
2016-11-22 15:38                             ` David Malcolm
2016-11-25 16:37                               ` Dominik Vogt
2016-12-01 10:13                               ` [PING] " Dominik Vogt
2016-12-01 12:28                                 ` Bernd Schmidt
2016-12-02 12:36                                   ` Andreas Krebbel
2016-10-12 20:33                 ` [PATCH] Tweaks " David Malcolm
2016-10-13 10:24                   ` Bernd Schmidt
2016-10-13 14:08                     ` David Malcolm
2016-10-13 14:18                       ` Bernd Schmidt
2016-10-14 19:41                         ` [PATCH] (v2) " David Malcolm
2016-10-14 20:07                           ` Bernd Schmidt
2016-10-19 14:36             ` RTL frontend input format again (was Re: [PATCH 15/16] RTL frontend (rtl1), on top of dump reader) David Malcolm
2016-10-19 14:42               ` Bernd Schmidt
2016-10-19 17:19                 ` David Malcolm
2016-10-19 17:22                   ` Bernd Schmidt
2016-10-19 17:54                     ` David Malcolm
2016-10-20 13:55     ` INSN_UIDs " David Malcolm
2016-10-20 14:11       ` Bernd Schmidt
2016-10-20 14:20         ` David Malcolm
2016-10-20 14:22           ` Bernd Schmidt
2016-10-26 18:19         ` [PATCH] Show INSN_UIDs in compact mode David Malcolm
2016-10-26 18:20           ` Bernd Schmidt
2016-10-06 15:24   ` [PATCH 15/16] RTL frontend (rtl1), on top of dump reader Bernd Schmidt
2016-10-07 16:22     ` [PATCH] RTL frontend (rtl1), on top of dump reader (v4) David Malcolm
2016-10-05 15:45 ` [PATCH 12/16] combine.c selftests (v2) David Malcolm
2016-10-05 15:45 ` [PATCH 02/16] (approved) Add selftest::read_file David Malcolm
2016-10-05 15:45 ` [PATCH 13/16] cse.c selftests David Malcolm

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1475846819.30303.21.camel@redhat.com \
    --to=dmalcolm@redhat.com \
    --cc=bschmidt@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).