public inbox for gcc@gcc.gnu.org
From: Richard Biener <richard.guenther@gmail.com>
To: Robert Dubner <rdubner@symas.com>
Cc: GCC Mailing List <gcc@gcc.gnu.org>
Subject: Re: New feature: -fdump-gimple-nodes (once more, with feeling)
Date: Wed, 14 Feb 2024 11:24:06 +0100
Message-ID: <CAFiYyc2PkmJPc_yRmxuVv=oPrmPziVbZGygX7K9tgqR1XM87RA@mail.gmail.com>
In-Reply-To: <094e01da5eb5$4c598e40$e50caac0$@symas.com>

On Tue, Feb 13, 2024 at 8:47 PM Robert Dubner <rdubner@symas.com> wrote:
>
> I have not contributed to GCC before, so I am not totally sure how to go
> about it.
>
> So, I am letting you know what I want to do, so that I can get advice on a
> good way to do it.  I have read https://gcc.gnu.org/contribute.html, and I
> have reviewed the Gnu Coding Standards and the GCC additional coding
> standards, so I have some idea of what's needed.  But there is a gulf
> between theory and practice, and I am hoping for guidance.
>
> Jim Lowden and I have been developing a COBOL front end for GCC.  He's
> primarily been parsing the language.  It's been my task to generate the
> GENERIC/GIMPLE trees for the parsed code.  We've been working at this for
> a couple of years.  We have reached the point where we want to start
> submitting patches for the community to evaluate.
>
> I figured I would start small, where "small" means mainly one new source
> code file of 1,580 lines.
>
> When I first started trying to generate GIMPLE trees to implement
> functions, it became clear to me that I needed to be able to
> reverse-engineer known good trees generated by the C front end.  Oh, I
> could see what other front ends were doing in their source code.  But I
> didn't know what the goal was.  I wanted to see not just individual nodes,
> but how they all related to each other.
>
> There didn't seem to be any such functionality in GCC.  I found a routine
> in print-tree.cc which printed out a single node, but I needed to
> understand the entire tree of nodes for a function.  And I very quickly
> got tired -- very tired -- of trying to figure out the relationships
> between nodes, and I wanted more information than the print-tree routines
> were providing.
>
> So, I created the gcc/dump-gimple-nodes.cc source code, which implements
> the dump_gimple_nodes() function, which is controlled by the new
> -fdump-gimple-nodes GCC command-line option.  That option hooks into the
> top of the gimplify_function_tree() function in gcc/gimplify.cc.

A first comment: you seem to dump the GENERIC graph that the frontend
feeds to the gimplifier.  That isn't GIMPLE yet, so the function should
possibly be named dump_generic_nodes ().

We dump a textual representation at a similar state with -fdump-tree-original.
With the -raw modifier (-fdump-tree-original-raw) the C frontend, for
example, emits

;; Function main (null)
;; enabled by -tree-original

@1      statement_list   0   : @2       1   : @3
@2      bind_expr        type: @4       body: @5
@3      return_expr      type: @4       expr: @6
@4      void_type        name: @7       algn: 8
@5      statement_list
@6      modify_expr      type: @8       op 0: @9       op 1: @10
@7      type_decl        name: @11      type: @4
@8      integer_type     name: @12      size: @13      algn: 32
                         prec: 32       sign: signed   min : @14
                         max : @15
...

I didn't track down where the C frontend triggers this or which utility
it ultimately uses.  It is also somewhat frontend specific, and likely
happens before genericization.
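
As a rough illustration of how a -raw dump like the excerpt above could be
turned into something more structured, here is a minimal C++ sketch.  It is
not part of GCC: RawNode, collect_refs and parse_raw_dump are hypothetical
names, and the parsing rule (a line starting with '@' opens a node, indented
lines continue it, every "@N" token is a reference) is an assumption
inferred only from the excerpt, not from the dumper's source.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one "@N" entry of a -raw dump.
struct RawNode {
  std::string code;       // e.g. "statement_list"
  std::vector<int> refs;  // every "@M" the entry mentions, in order
};

// Append the number of each "@M" token found in `line` from `pos` onwards.
static void collect_refs(const std::string &line, std::size_t pos,
                         std::vector<int> &out) {
  while ((pos = line.find('@', pos)) != std::string::npos) {
    std::size_t end = pos + 1;
    while (end < line.size() &&
           std::isdigit(static_cast<unsigned char>(line[end])))
      ++end;
    if (end > pos + 1)
      out.push_back(std::stoi(line.substr(pos + 1, end - pos - 1)));
    pos = end;
  }
}

// Assumed layout: "@N  code  key: value ..." with indented continuation
// lines; anything else (";; ..." headers, "...", blank lines) is skipped.
std::map<int, RawNode> parse_raw_dump(const std::string &text) {
  std::map<int, RawNode> nodes;
  std::istringstream in(text);
  std::string line;
  int current = -1;  // id of the entry being continued, or -1 for none
  while (std::getline(in, line)) {
    if (!line.empty() && line[0] == '@') {
      std::istringstream head(line.substr(1));
      std::string code;
      if (!(head >> current >> code)) {
        current = -1;
        continue;
      }
      nodes[current].code = code;
      // Skip the entry's own id so it is not counted as a reference.
      std::size_t after_id = line.find_first_not_of("0123456789", 1);
      if (after_id != std::string::npos)
        collect_refs(line, after_id, nodes[current].refs);
    } else if (current >= 0 && !line.empty() &&
               std::isspace(static_cast<unsigned char>(line[0]))) {
      collect_refs(line, 0, nodes[current].refs);
    } else {
      current = -1;
    }
  }
  return nodes;
}
```

Feeding the excerpt to parse_raw_dump() would give, for entry 8, the code
"integer_type" and the references 12, 13, 14 and 15; plain values such as
"algn: 32" are deliberately ignored by this sketch.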

I agree with Andi that these days something more structured might be
preferable (but your HTML example might be good for a human to read and
click through).

> The dump_gimple_nodes() function does a depth-first walk of the specified
> function_decl, outputting each node once in a readable format.  Each node
> gets an arbitrary identifying number.  There are two output files; the
> first, "func_name.nodes", is pure text.  After I got tired of endlessly
> searching through the text file for the next node of interest, I created
> the "func_name.nodes.html" file, which is the same information with
> internal hyperlinks between the nodes.
>
> Here are the first two nodes of a typical simple function:
>
> ***********************************This is NodeNumber0
> (0x7f12e13b0d00) NodeNumber0
> tree_code: function_decl
> tree_code_class: tcc_declaration
> base_flags: static public
> type: NodeNumber1 function_type
> name: NodeNumber6410 identifier_node "main"
> context: NodeNumber107 translation_unit_decl "bigger.c"
> source_location: bigger.c:7:5
> uid: 3663
> initial(bindings): NodeNumber6411 block
> machine_mode: QI(15)
> align: 8
> warn_if_not_align: 0
> pt_uid: 3663
> raw_assembler_name: NodeNumber6410 identifier_node "main"
> visibility: default
> result: NodeNumber6412 result_decl
> function(pointer): 0x7f12e135d508
> arguments: NodeNumber6413 parm_decl "argc"
> saved_tree(function_body): NodeNumber6417 statement_list
> function_code: 0
> function_flags: public no_instrument_function_entry_exit
> ***********************************This is NodeNumber1
> (0x7f12e13b3d20) NodeNumber1
> tree_code: function_type
> tree_code_class: tcc_type
> machine_mode: QI(15)
> type: NodeNumber2 integer_type
> address_space:0
> size(in bits): NodeNumber55 uint128 8
> size_unit(in bytes): NodeNumber12 uint64 1
> uid: 1515
> precision: 0
> contains_placeholder: 0
> align: 8
> warn_if_not_align: 0
> alias_set_type: -1
> canonical: NodeNumber1 function_type
> main_variant: NodeNumber1 function_type
> values: NodeNumber6408 tree_list
> ***********************************
>
> Note how even when an attribute points to another node, e.g.,
>
> arguments: NodeNumber6413 parm_decl "argc"
>
> the output routine goes down another level or two in an attempt to make it
> more meaningful.  The attribute points just to NodeNumber6413, but the
> output shows that node to be a parm_decl, and there is additional code
> that recognizes that a parm_decl has an identifier_node with the value
> "argc".
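
The walk described above (depth-first, each node emitted once, arbitrary
numbers, attributes summarized one level deeper) can be sketched roughly as
follows.  This is a toy model and not the code from dump-gimple-nodes.cc:
Node is a hypothetical stand-in for GCC's tree, NodeDumper and its members
are invented names, and the two-pass structure (number first, print second)
is only an assumption about how forward references such as NodeNumber6410
could be resolved.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tree node; NOT GCC's real `tree` type.
struct Node {
  std::string code;  // e.g. "function_decl"
  std::string text;  // only set for identifier-like nodes, e.g. "argc"
  std::vector<std::pair<std::string, const Node *>> fields;  // attr -> node
};

class NodeDumper {
public:
  // Pass 1 numbers every reachable node; pass 2 prints each node once.
  std::string dump(const Node *root) {
    assign_numbers(root);
    std::ostringstream out;
    for (const Node *n : order_) {
      out << "*** This is NodeNumber" << ids_.at(n) << "\n";
      out << "tree_code: " << n->code << "\n";
      for (const auto &f : n->fields)
        if (f.second)
          out << f.first << ": " << summary(f.second) << "\n";
    }
    return out.str();
  }

  std::size_t node_count() const { return order_.size(); }

private:
  // Pre-order depth-first walk; the map lookup stops at nodes already
  // seen, so shared nodes and cycles are handled.
  void assign_numbers(const Node *n) {
    if (!n || ids_.count(n))
      return;
    ids_.emplace(n, order_.size());
    order_.push_back(n);
    for (const auto &f : n->fields)
      assign_numbers(f.second);
  }

  // "NodeNumberN code", plus the node's own text or, failing that, the
  // text of its "name" field (one level deeper).
  std::string summary(const Node *n) const {
    assert(ids_.count(n) && "summary() expects an already numbered node");
    std::string s = "NodeNumber" + std::to_string(ids_.at(n)) + " " + n->code;
    const std::string *text = &n->text;
    if (text->empty())
      for (const auto &f : n->fields)
        if (f.first == "name" && f.second && !f.second->text.empty()) {
          text = &f.second->text;
          break;
        }
    if (!text->empty())
      s += " \"" + *text + "\"";
    return s;
  }

  std::unordered_map<const Node *, std::size_t> ids_;
  std::vector<const Node *> order_;
};
```

With a function_decl whose "arguments" field points at a parm_decl named
by an identifier "argc", dump() would emit a line of the same shape as the
one quoted above; the self-referencing "canonical" field of a function_type
is harmless because already numbered nodes are not revisited.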
>
> An example of a complete dump is available at
> https://www.dubner.com/main.nodes.html.  The C source code that generated
> it is available at the end of
> https://cobolworx.com/pages/dump-gimple-nodes.html
>
> I found this feature to be absolutely necessary when figuring out how
> working front ends built valid GIMPLE trees for functions.  I am hopeful
> other developers can see the utility.
>
> Does this require any further discussion?  Or is my next step to start
> developing the series of patches that will create the dump-gimple-nodes
> source code, and that will modify Makefile.in, gimplify.cc, and common.opt
> to incorporate it?
>
> Thanks so much for any suggestions and guidance,
>
> Bob Dubner
>
