public inbox for gcc@gcc.gnu.org
* Mapping of TREE_CODE to tree_node
@ 2023-06-26 16:59 Aaron Lorey
  2023-06-26 17:45 ` Andrew Pinski
  2023-06-26 18:08 ` David Malcolm
  0 siblings, 2 replies; 7+ messages in thread
From: Aaron Lorey @ 2023-06-26 16:59 UTC (permalink / raw)
  To: gcc


Hello,

This is the first time I am writing to a mailing list. I tried to look
up the usual procedure, but nothing special seems to be required.

I'm currently trying to do a complete graph discovery of GCC's symtab and
tree_nodes in order to dump the full internal representation of a
compilation unit. GitLab: https://gitlab.com/graph-prog/code-database

Serializing the internal state to disk is not exceptionally hard, but it
is not easy either. I suspect this task was simply not considered in the
design.

I am writing to the list because I have trouble connecting the
TREE_CODE enumeration to the appropriate struct tree_node memory layout
without guessing.

Can you provide a mapping of TREE_CODE to tree_node memory layout?

kind regards


* Re: Mapping of TREE_CODE to tree_node
  2023-06-26 16:59 Mapping of TREE_CODE to tree_node Aaron Lorey
@ 2023-06-26 17:45 ` Andrew Pinski
  2023-06-26 18:08 ` David Malcolm
  1 sibling, 0 replies; 7+ messages in thread
From: Andrew Pinski @ 2023-06-26 17:45 UTC (permalink / raw)
  To: Aaron Lorey; +Cc: gcc

On Mon, Jun 26, 2023 at 10:01 AM Aaron Lorey via Gcc <gcc@gcc.gnu.org> wrote:
>
> Can you provide a mapping of TREE_CODE to tree_node memory layout?

See tree_node_structure_for_code and the GTY tag markers on the members
of tree_node. For example,
  struct tree_string GTY ((tag ("TS_STRING"))) string;

says that this member is used when the tag is TS_STRING, and TS_STRING
is the structure returned for tree code STRING_CST:
    case STRING_CST:            return TS_STRING;

For front-end specific trees there is a front-end specific function
which does the mapping for those too.
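
The two-step lookup described above (tree code to TS_* tag, then tag to
union member) can be sketched as follows. This is a minimal,
self-contained model and not GCC's actual code: the three tree codes, the
struct fields, and the helper names structure_for_code and
member_for_structure are assumptions made for illustration. The real
definitions live in the GCC sources and cover far more cases.

```cpp
// Simplified stand-in for GCC's tree structures. Every definition here is
// hypothetical and much smaller than the real ones in the GCC sources.
#include <cassert>
#include <string>

enum tree_code { IDENTIFIER_NODE, INTEGER_CST, STRING_CST };
enum tree_node_structure_enum { TS_IDENTIFIER, TS_INT_CST, TS_STRING };

struct tree_base { tree_code code; };
struct tree_identifier { tree_base base; const char *name; };
struct tree_int_cst { tree_base base; long value; };
struct tree_string { tree_base base; int length; char str[1]; };

// One union member per structure tag; the tag selects the active layout.
union tree_node {
  tree_base base;
  tree_identifier identifier;  // used when the tag is TS_IDENTIFIER
  tree_int_cst int_cst;        // used when the tag is TS_INT_CST
  tree_string string;          // used when the tag is TS_STRING
};

// Step 1: tree code -> structure tag.
inline tree_node_structure_enum structure_for_code(tree_code code) {
  switch (code) {
    case IDENTIFIER_NODE: return TS_IDENTIFIER;
    case INTEGER_CST:     return TS_INT_CST;
    case STRING_CST:      return TS_STRING;
  }
  assert(false && "tree code not modelled in this sketch");
  return TS_IDENTIFIER;
}

// Step 2: structure tag -> name of the union member that holds the layout.
inline std::string member_for_structure(tree_node_structure_enum ts) {
  switch (ts) {
    case TS_IDENTIFIER: return "identifier";
    case TS_INT_CST:    return "int_cst";
    case TS_STRING:     return "string";
  }
  assert(false && "structure tag not modelled in this sketch");
  return "base";
}
```

A dumper built this way would read the code from the common base, call
the first function to obtain the tag, and then interpret the node through
the matching union member instead of guessing the layout.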

Thanks,
Andrew Pinski


* Re: Mapping of TREE_CODE to tree_node
  2023-06-26 16:59 Mapping of TREE_CODE to tree_node Aaron Lorey
  2023-06-26 17:45 ` Andrew Pinski
@ 2023-06-26 18:08 ` David Malcolm
  2023-07-03  0:46   ` Aaron Lorey
  1 sibling, 1 reply; 7+ messages in thread
From: David Malcolm @ 2023-06-26 18:08 UTC (permalink / raw)
  To: Aaron Lorey, gcc

On Mon, 2023-06-26 at 18:59 +0200, Aaron Lorey via Gcc wrote:
> Can you provide a mapping of TREE_CODE to tree_node memory layout?

I don't know that such a mapping exists directly, but have a look at
the functions "tree_code_size" and "tree_size" defined in gcc/tree.cc.

You might also find the LTO streaming code of interest; see
gcc/lto-streamer-{in,out}.cc
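
To illustrate why there might be two size functions, here is a small
self-contained model, not GCC's code. It assumes, as I understand the
design, that one function can answer from the code alone while the other
needs the actual node because some nodes carry a variable-length tail.
The names node_code, fixed_node, string_node, code_size and node_size are
hypothetical.

```cpp
// Hypothetical model of per-code versus per-node size computation.
#include <cassert>
#include <cstddef>

enum node_code { FIXED_NODE, STRING_NODE };

struct node_base { node_code code; };
struct fixed_node { node_base base; long payload; };
// Variable-length node: str[] really extends past the end of the struct.
struct string_node { node_base base; int length; char str[1]; };

// Size known from the code alone. For STRING_NODE this is only a minimum.
inline std::size_t code_size(node_code code) {
  switch (code) {
    case FIXED_NODE:  return sizeof(fixed_node);
    case STRING_NODE: return sizeof(string_node);
  }
  return 0;
}

// Size of one concrete node, including any variable-length tail.
inline std::size_t node_size(const node_base *n) {
  assert(n != nullptr);
  if (n->code == STRING_NODE) {
    // base is the first member, so the pointer can be converted back.
    const string_node *s = reinterpret_cast<const string_node *>(n);
    // Header up to str, the characters, and a terminating NUL.
    return offsetof(string_node, str) + static_cast<std::size_t>(s->length) + 1;
  }
  return code_size(n->code);
}
```

For a serializer, the per-node variant is the one to call when copying
bytes, since the per-code answer can undercount variable-length nodes.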

Hope this is helpful
Dave




* Re: Mapping of TREE_CODE to tree_node
  2023-06-26 18:08 ` David Malcolm
@ 2023-07-03  0:46   ` Aaron Lorey
  2023-07-03  0:50     ` Andrew Pinski
  0 siblings, 1 reply; 7+ messages in thread
From: Aaron Lorey @ 2023-07-03  0:46 UTC (permalink / raw)
  To: David Malcolm; +Cc: gcc

On Mon, 26 Jun 2023 at 20:09, David Malcolm <dmalcolm@redhat.com> wrote:
>
> I don't know that such a mapping exists directly, but have a look at
> the functions "tree_code_size" and "tree_size" defined in gcc/tree.cc.
>
> You might also find the LTO streaming code of interest; see
> gcc/lto-streamer-{in,out}.cc
>
> Hope this is helpful
> Dave
>
>

Thank you for your reply.

The tree_size() and tree_code_size() functions are useful, although incomplete.

If I understand correctly, the link time optimization works on the
GIMPLE representation. The original syntax tree and symbol table would
be preferable.

Andrew's suggestion might be more what I'm looking for.


* Re: Mapping of TREE_CODE to tree_node
  2023-07-03  0:46   ` Aaron Lorey
@ 2023-07-03  0:50     ` Andrew Pinski
  2023-08-11 23:30       ` Aaron Lorey
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Pinski @ 2023-07-03  0:50 UTC (permalink / raw)
  To: Aaron Lorey; +Cc: David Malcolm, gcc

On Sun, Jul 2, 2023 at 5:48 PM Aaron Lorey via Gcc <gcc@gcc.gnu.org> wrote:
>
> Thank you for your reply.
>
> The tree_size() and tree_code_size() functions are useful, although incomplete.
>
> If I understand correctly, the link time optimization works on the
> GIMPLE representation. The original syntax tree and symbol table would
> be preferable.

You could also look into the module support in the C++ front end,
`gcc/cp/module.cc`, which does write out the original trees and such.

Thanks,
Andrew

>
> Andrew's suggestion might be more what I'm looking for.


* Re: Mapping of TREE_CODE to tree_node
  2023-07-03  0:50     ` Andrew Pinski
@ 2023-08-11 23:30       ` Aaron Lorey
  2023-08-15 20:00         ` Jason Merrill
  0 siblings, 1 reply; 7+ messages in thread
From: Aaron Lorey @ 2023-08-11 23:30 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: David Malcolm, gcc

On Mon, 3 Jul 2023 at 02:50, Andrew Pinski <pinskia@gmail.com> wrote:
>
> You could also look into the module support in the C++ front end,
> `gcc/cp/module.cc`, which does write out the original trees and such.
>
> Thanks,
> Andrew

I've now managed to dump the syntax tree of the compilation unit
(tree_function_decl.saved_tree -> tree_exp.operands ->
tree_statement_list.nodes). Thank you very much for the help!
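
The walk along saved_tree, operands and statement-list nodes amounts to
a graph discovery with a visited set, roughly as sketched below. The
sketch is self-contained and does not use GCC's types: the node struct
with a generic edges vector and the function discover are hypothetical
stand-ins for the real fields named above.

```cpp
// Hypothetical graph-discovery sketch; "edges" stands in for fields such
// as saved_tree, operands, or statement-list nodes.
#include <cassert>
#include <unordered_set>
#include <vector>

struct node {
  int id;
  std::vector<node *> edges;
};

// Iterative depth-first discovery. Each reachable node is returned once,
// even if the graph contains shared nodes or cycles.
inline std::vector<const node *> discover(const node *root) {
  std::vector<const node *> order;
  std::unordered_set<const node *> seen;
  std::vector<const node *> stack;
  if (root != nullptr) stack.push_back(root);

  while (!stack.empty()) {
    const node *n = stack.back();
    stack.pop_back();
    if (!seen.insert(n).second) continue;  // already visited
    order.push_back(n);
    // Push in reverse so the first edge is visited first.
    for (auto it = n->edges.rbegin(); it != n->edges.rend(); ++it)
      if (*it != nullptr) stack.push_back(*it);
  }
  assert(seen.size() == order.size());  // every node recorded exactly once
  return order;
}
```

The visited set matters because the nodes form a graph rather than a
strict tree: the same node can be reachable through several paths, and
back references would otherwise lead to an endless walk.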

To print the original code, I need to know which piece of source text
each node was produced from. Is there a way to get the original tokens
(or their offsets in the source file) per tree_node without modifying
the parser?


* Re: Mapping of TREE_CODE to tree_node
  2023-08-11 23:30       ` Aaron Lorey
@ 2023-08-15 20:00         ` Jason Merrill
  0 siblings, 0 replies; 7+ messages in thread
From: Jason Merrill @ 2023-08-15 20:00 UTC (permalink / raw)
  To: Aaron Lorey; +Cc: Andrew Pinski, David Malcolm, gcc


On Fri, Aug 11, 2023 at 7:31 PM Aaron Lorey via Gcc <gcc@gcc.gnu.org> wrote:

> I've now managed to dump the syntax tree of the compilation unit
> (tree_function_decl.saved_tree -> tree_exp.operands ->
> tree_statement_list.nodes). Thank you very much for the help!
>
> To print the original code, I need to know which piece of source text
> each node was produced from. Is there a way to get the original tokens
> (or their offsets in the source file) per tree_node without modifying
> the parser?
>

Generally we try to track the corresponding source location for a lot of
things and attach them to the relevant tree nodes (EXPR_LOCATION,
DECL_SOURCE_LOCATION).  In many cases there is a lot of room for
improvement in this.  For instance, for a class, instead of just storing
the location of the name, we could remember the range from the class-key to
the closing brace.
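
As a rough picture of what attaching a location to a node means, here is
a small self-contained model. It is not GCC's implementation: the types
location_handle, source_position, location_table and located_node are
hypothetical, and the only idea borrowed is that a node stores a compact
handle which is expanded to file, line and column on demand.

```cpp
// Hypothetical model of per-node source locations stored as handles.
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using location_handle = unsigned;  // 0 means "unknown location"

struct source_position {
  std::string file;
  int line;
  int column;
};

class location_table {
 public:
  // Record a position and return the handle a node would store.
  location_handle add(std::string file, int line, int column) {
    assert(line >= 0 && column >= 0);
    entries_.push_back({std::move(file), line, column});
    return static_cast<location_handle>(entries_.size());  // handles start at 1
  }

  // Turn a handle back into file, line and column.
  source_position expand(location_handle h) const {
    if (h == 0 || h > entries_.size()) return {"<unknown>", 0, 0};
    return entries_[h - 1];
  }

 private:
  std::vector<source_position> entries_;
};

// A node carries only the handle, not the full position.
struct located_node {
  int code;
  location_handle loc;
};
```

A dumper could emit the expanded position next to each node. Recovering
the exact source text would additionally need an end position per node,
which is the kind of range information mentioned above as not always
being tracked.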

Jason
