public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: David Malcolm <dmalcolm@redhat.com>
To: Richard Sandiford <rdsandiford@googlemail.com>
Cc: Steven Bosscher <stevenb.gcc@gmail.com>,
	       GCC Patches	 <gcc-patches@gcc.gnu.org>
Subject: Instructions vs Expressions in the backend (was Re: RFA: Rework FOR_BB_INSNS iterators)
Date: Mon, 23 Jun 2014 19:01:00 -0000	[thread overview]
Message-ID: <1403549659.16446.43.camel@surprise> (raw)
In-Reply-To: <87d2ehq3p6.fsf@talisman.default>

On Mon, 2014-06-09 at 20:32 +0100, Richard Sandiford wrote:
> Steven Bosscher <stevenb.gcc@gmail.com> writes:
> > On Sat, Jun 7, 2014 at 7:54 PM, Richard Sandiford wrote:
> >> The two parts of the loop condition are really handling two different
> >> kinds of block: ones like entry and exit that are completely empty
> >> and normal ones that have at least a block note.  There's no real
> >> need to check for null INSNs in normal blocks.
> >
> > Block notes should go away some day, they're just remains of a time
> > when there was no actual CFG in the compiler.
> 
> Yeah.  I suppose when that happens empty blocks would look just like
> entry and exit as far as these iterators go.
> 
> >> Also, refetching NEXT_INSN (BB_END (BB)) for each iteration can be expensive.
> >> If we're prepared to say that the loop body can't insert instructions
> >> for another block immediately after BB_END,
> >
> > This can happen with "block_label()" if e.g. a new jump is inserted
> > for one reason or another. Not very likely for passes working in
> > cfglayout mode, but post-RA there may be places that need this
> > (splitters, peepholes, machine dependent reorgs, etc.).
> >
> > So even if we're prepared to say what you suggest, I don't think you
> > can easily enforce it.
> 
> Probably true.  But if we want to allow insertions after BB_END,
> we need to make FOR_BB_INSNS_SAFE handle that for INSN == BB_END too.
> 
> The alternative definition:
> 
>   for (rtx INSN = BB_HEAD (BB), INSN##_cond_ = INSN, INSN##_next_;	\
>        INSN##_cond_ 							\
> 	 && (INSN##_next_ = NEXT_INSN (INSN),				\
> 	     INSN##_cond_ = BB_END (BB),				\
> 	     true);							\
>        INSN##_cond_ = (INSN == INSN##_cond_ ? NULL_RTX : (rtx) 1),	\
>        INSN = INSN##_next_)
> 
> works too.  It isn't quite as fast, but obviously correctness comes first.
> 
> >> It's easier to change these macros if they define the INSN variables
> >> themselves.
> >
> > If you're going to hack these iterators anyway (much appreciated BTW),
> > I suggest to make them similar to the gsi, loop, edge, and bitmap
> > iterators: A new "insn_iterator" structure to hold the variables and
> > static inline functions wrapped in the macros. This will also be
> > helpful if (when) we ever manage to make the type for an insn a
> > non-rtx...
> 
> Well, with this and...
> 
> >> +/* For iterating over insns in a basic block.  The iterator allows the loop
> >> +   body to delete INSN.  It also ignores any instructions that the body
> >> +   inserts between INSN and the following instruction.  */
> >> +#define FOR_BB_INSNS(BB, INSN)                                         \
> >> +  for (rtx INSN = BB_HEAD (BB), INSN##_cond_ = INSN, INSN##_next_,     \
> >> + INSN##_end_ = INSN ? NEXT_INSN (BB_END (BB)) : NULL_RTX; \
> >> + INSN##_cond_ && (INSN##_next_ = NEXT_INSN (INSN), true); \
> >> +       INSN = INSN##_next_,                                            \
> >> +       INSN##_cond_ = (INSN != INSN##_end_ ? (rtx) 1 : NULL_RTX))
> >
> > This just makes my eyes hurt...
> >
> >
> > What about cases where a FOR_BB_INSNS is terminated before reaching
> > the end of a basic block, and you need to know at what insn you
> > stopped? Up to now, you could do:
> >
> >   rtx insn; basic_block bb;
> >   FOR_BB_INSNS (bb, insn)
> >     {
> >       ... // do stuff
> >       if (something) break;
> >     }
> >   do_something_with (insn);
> >
> > Looks like this is no longer possible with the implementation of
> > FOR_BB_INSNS of your patch.
> 
> ...this I suppose it depends where we want to go with these iterators.
> I'm hoping eventually we'll move to C++11, where the natural way of
> writing the loop would be:
> 
>   for (rtx insn : bb->insns ())
> 
> (Or "auto *" instead of "rtx" if you prefer.)
> 
> And I think the idiom of having the FOR_* macro define the iterator
> variable is much closer to that than:
> 
>   rtx insn;
>   FOR_BB_INSNS (iter, bb, insn)
> 
> would be.
> 
> It's true that you can't leave "insn" with a signficant value after
> the loop, but no current code wants to do that.  Personally I like
> the fact that loops that do want to set a final value have to make
> that explicit.  When the vast majority (currently all) instances of:
> 
>   FOR_BB_INSNS (bb, insn)
> 
> treat "insn" as local to the loop, it's helpful when the exceptions
> are obvious.
> 
> I think if anything the patch would make it easier to change the
> type of insns to something other than rtx.

(sorry for the belated response; I was on vacation).

FWIW I'm actually working on such a change, or, at least, to make
instructions be a subclass of rtx_def; presumably this should make it
easier to make it an entirely separate type, if that's desirable.

Executive summary is that it's still a work in progress, and I'm going
to be giving a talk about it at Cauldron next month ("A proposal for
typesafe RTL", currently scheduled for the Sunday at 12.45 in Stream 2).

More detailed version:
I experimented with this a few months ago, and came up with a 159-patch
patch series:
http://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v1/

That patch kit builds on x86_64 (and regrtests iirc), and uses a
separate subclass for insns e.g. in the core basic block class, in the
scheduler, register allocators, etc, but:
  (A) it only builds on about 70 of the ~200 configurations in
config-list.mk
  (B) there was a tangled mess of dependencies between the patches
making it hard to fix (A)
  (C) I went with the rather verbose "rtx_base_insn" name, which in v1
is a typedef for a pointer to an insn, along with a family of other
typedefs, which I now realize is frowned upon e.g.:
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01520.html
https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01632.html
  (D) I was using as_a *methods* rather than just as_a <>, like
in v1 of my gimple-classes patches:
(https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00887.html)


So I've been reworking it, incorporating feedback from the gimple
statement classes work; the latest patch series is v7 and can be seen
here:
http://dmalcolm.fedorapeople.org/gcc/patch-backups/rtx-classes/v7/

It builds and regrtests on x86_64 on top of r211061 (aka
dcd5393fd5b3ac53775a5546f3103958b64789bb) and builds on 184 of the
configurations in config-list.mk; I have patches to fix the remaining
ones to achieve parity with an unpatched build.

The current class hierarchy looks like this:
/* Subclasses of rtx_def, using indentation to show the class
   hierarchy, along with the relevant invariant.  */
class rtx_def;
  class rtx_insn; /* (INSN_P (X) || NOTE_P (X)
	               || JUMP_TABLE_DATA_P (X) || BARRIER_P (X)
	               || LABEL_P (X)) */
    class rtx_real_insn;         /* INSN_P (X) */
      class rtx_debug_insn;      /* DEBUG_INSN_P (X) */
      class rtx_nonjump_insn;    /* NONJUMP_INSN_P (X) */
      class rtx_jump_insn;       /* JUMP_P (X) */
      class rtx_call_insn;       /* CALL_P (X) */
    class rtx_jump_table_data;   /* JUMP_TABLE_DATA_P (X) */
    class rtx_barrier;           /* BARRIER_P (X) */
    class rtx_code_label;        /* LABEL_P (X) */
    class rtx_note;              /* NOTE_P (X) */

and the code now uses "rtx_insn *" in hundreds of places where it would
previously have used "rtx".

My aim is that it always builds on every config: each patch is
self-contained.  To do this, the initial patches add scaffolding that
adds some strategically-placed checked downcasts to rtx_insn * (e.g. on
the result of PREV_INSN/NEXT_INSN).  The subsequent patches then use
these promises in order to strengthen the insn-handling code to work on
the new subclass. The idea is that the final patches then remove this
scaffolding, with the core BB types becoming rtx_insn *.  The drawback
of course is that during the process the resulting compiler is much
slower - until the scaffolding is removed.

v1 of the patch series also added some non-insn subclasses of rtx, for
EXPR_LIST, INSN_LIST, and for SEQUENCE, allowing rewriting of iterations
like this (in jump.c:rebuild_jump_labels_1), where "insn" and
"forced_labels" are INSN_LIST and hence are "rtx_insn_list *":

-    for (insn = forced_labels; insn; insn = XEXP (insn, 1))
-      if (LABEL_P (XEXP (insn, 0)))
-	  LABEL_NUSES (XEXP (insn, 0))++;
+    for (insn = forced_labels; insn; insn = insn->next ())
+      if (LABEL_P (insn->element ()))
+	  LABEL_NUSES (insn->element ())++;

Thoughts?
Dave

> It would just mean changing
> the "rtx" at the start of the two "for" loops to the new type,
> whereas at the moment you would need to change all the separate
> "rtx insn"s.  (In particular, it's common for "insn" to be defined
> on the same line as other rtx variables that might need to stay
> as "rtx"es after the change.)



  reply	other threads:[~2014-06-23 19:01 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-07 17:54 RFA: Rework FOR_BB_INSNS iterators Richard Sandiford
2014-06-07 20:26 ` Steven Bosscher
2014-06-09 19:32   ` Richard Sandiford
2014-06-23 19:01     ` David Malcolm [this message]
2014-06-23 20:38       ` Instructions vs Expressions in the backend (was Re: RFA: Rework FOR_BB_INSNS iterators) Oleg Endo
2014-06-25  9:36         ` Richard Sandiford
2014-06-25 20:39           ` Jeff Law
2014-06-27 14:28             ` David Malcolm
2014-06-27 15:38               ` Jeff Law
2014-06-27  7:36           ` Oleg Endo
2014-06-27 14:35           ` David Malcolm
2014-06-25  8:54       ` Richard Sandiford
2014-06-25 20:46         ` Jeff Law
2014-06-25 21:24           ` Steven Bosscher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1403549659.16446.43.camel@surprise \
    --to=dmalcolm@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=rdsandiford@googlemail.com \
    --cc=stevenb.gcc@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).