public inbox for gcc-help@gcc.gnu.org
* GCC plugins: mapping AST back to source files, missing locations.
@ 2012-10-18 20:08 mefyl
  2012-10-19  0:19 ` Ian Lance Taylor
  0 siblings, 1 reply; 3+ messages in thread
From: mefyl @ 2012-10-18 20:08 UTC (permalink / raw)
  To: gcc-help

Hi,

I've been playing with GCC plugins for a few weeks for static code analysis 
and source-to-source refactoring, and I figured a good first exercise would be 
to extract variable bindings, i.e. output a map that binds usages of types, 
functions and variables to their declarations, with source code locations. A 
semantic etags, one might say. I know such tools already exist.

I had a hard time figuring everything out due to the lack of documentation, 
but thanks to the headers and some helpful blog posts I am now able to walk 
the AST and register entity declarations and their usages. However, I have a 
few problems I can't seem to solve on my own despite intense googling. Here 
are the two main ones:

* I can retrieve source file location of declarations with 
  DECL_SOURCE_LOCATION without any problem, but I can't get locations for 
  expressions since their EXPR_LOCATION is always null. I used the 
  PLUGIN_PRE_GENERICIZE hook, which AFAIK is the earliest, so I expected 
  locations to still be there. Did I miss something, or is there no way to
  retrieve such locations from plugins, in which case my example seems 
  impossible?
* Even if we suppose I fixed the previous issue, from what I inferred from the 
  AST, a reference to a variable is directly represented by the corresponding 
  VAR_DECL. That is, in "int x; int y; x + y;", "x + y" is a PLUS_EXPR whose 
  operands are the VAR_DECL corresponding to "int x" and "int y". This would 
  make it impossible for me to get the location of "x" and "y", since they are
  not represented as such by nodes in the AST.
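
To make the second point concrete, here is a toy C++ model of what I think I
am seeing. It is only a sketch: Node, Loc, Code and both functions are made-up
names and are not GCC's tree API, and the column numbers are simply counted by
hand from "int x; int y; x + y;" written on a single line. The PLUS_EXPR
location (column 17, the operator) is an assumption as well.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for tree codes; NOT GCC's real enum.
enum class Code { VarDecl, PlusExpr };

// A source position. Only line and column are modelled here.
struct Loc {
    int line;
    int column;
};

// A minimal AST node. Operands are non-owning pointers, so a use of a
// variable is the very same node as its declaration.
struct Node {
    Code code;
    std::string name;                   // meaningful for VarDecl only
    Loc loc;                            // declaration or expression location
    std::vector<const Node*> operands;  // empty for VarDecl
};

// Collect the location stored on each operand of an expression.
std::vector<Loc> operand_locations(const Node& expr) {
    std::vector<Loc> out;
    for (const Node* op : expr.operands) {
        assert(op != nullptr);
        out.push_back(op->loc);
    }
    return out;
}

// Model "int x; int y; x + y;" and return the columns found on the
// operands of "x + y".
std::vector<int> operand_columns_for_example() {
    Node x{Code::VarDecl, "x", {1, 5}, {}};    // "x" declared at column 5
    Node y{Code::VarDecl, "y", {1, 12}, {}};   // "y" declared at column 12
    // The uses sit at columns 15 and 19, but no node records that:
    // the operands point straight at the declarations.
    Node sum{Code::PlusExpr, "", {1, 17}, {&x, &y}};

    std::vector<int> columns;
    for (const Loc& l : operand_locations(sum)) {
        columns.push_back(l.column);
    }
    return columns;
}
```

In this model, operand_columns_for_example() yields 5 and 12, the declaration
columns. The positions of the uses (15 and 19) are stored nowhere, which is
exactly the information a refactoring tool would need.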

I fear that because GCC is first and foremost a compiler, the AST I'm working 
on has already been altered, with some processing and simplification applied, 
which prevents source-to-source transformation since one cannot map AST chunks 
back to source files. Could anyone confirm this, or point me in the right 
direction if I'm wrong? If it is indeed impossible, is there any intent to 
improve this aspect, for instance by giving access to earlier stages of the 
AST (freshly parsed, with only variable bindings resolved)? This would make it 
possible to write extremely powerful analysis and refactoring tools.

Thanks in advance for your help,

-- 
mefyl


* Re: GCC plugins: mapping AST back to source files, missing locations.
  2012-10-18 20:08 GCC plugins: mapping AST back to source files, missing locations mefyl
@ 2012-10-19  0:19 ` Ian Lance Taylor
  2012-10-22 12:31   ` mefyl
  0 siblings, 1 reply; 3+ messages in thread
From: Ian Lance Taylor @ 2012-10-19  0:19 UTC (permalink / raw)
  To: mefyl; +Cc: gcc-help

On Thu, Oct 18, 2012 at 9:34 AM, mefyl <mefyl@gruntech.org> wrote:
>
> * I can retrieve source file location of declarations with
>   DECL_SOURCE_LOCATION without any problem, but I can't get locations for
>   expressions since their EXPR_LOCATION is always null. I used the
>   PLUGIN_PRE_GENERICIZE hook, which AFAIK is the earliest, so I expected
>   locations to still be there. Did I miss something, or is there no way to
>   retrieve such locations from plugins, in which case my example seems
>   impossible?

Most expressions should have locations these days.  It is certainly
true that there will be some that do not.  Hard to say more without
more details.

> * Even if we suppose I fixed the previous issue, from what I inferred from the
>   AST, a reference to a variable is directly represented by the corresponding
>   VAR_DECL. That is, in "int x; int y; x + y;", "x + y" is a PLUS_EXPR whose
>   operands are the VAR_DECL corresponding to "int x" and "int y". This would
>   make it impossible for me to get the location of "x" and "y", since they are
>   not represented as such by nodes in the AST.

That is correct.

> I fear that due to the compiler nature of GCC, the AST I'm working on is
> altered, with some processing and simplification already applied, preventing
> source-to-source code transformation since one cannot map AST chunks back to
> source files. Could anyone confirm or point me in some direction if I'm wrong?
> In the case it's indeed impossible, is there any intent to improve this
> aspect, like giving access to earlier stages of the AST (freshly parsed, with
> only variable bindings for instance)? This would make it possible to write
> extremely powerful analysis & refactoring tools.

Unfortunately the GCC AST almost certainly does not retain enough
information for the kind of transformations you want to do.  This is a
deficiency in the AST, but I can't even estimate how difficult it
would be to fix.

Ian


* Re: GCC plugins: mapping AST back to source files, missing locations.
  2012-10-19  0:19 ` Ian Lance Taylor
@ 2012-10-22 12:31   ` mefyl
  0 siblings, 0 replies; 3+ messages in thread
From: mefyl @ 2012-10-22 12:31 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-help

On Thursday 18 October 2012 11:06:24 Ian Lance Taylor wrote:
> On Thu, Oct 18, 2012 at 9:34 AM, mefyl <mefyl@gruntech.org> wrote:
> > I fear that due to the compiler nature of GCC, the AST I'm working on is
> > altered, with some processing and simplification already applied,
> > preventing source-to-source code transformation since one cannot map AST
> > chunks back to source files. Could anyone confirm or point me in some
> > direction if I'm wrong? In the case it's indeed impossible, is there any
> > intent to improve this aspect, like giving access to earlier stages of
> > the AST (freshly parsed, with only variable bindings for instance)? This
> > would make it possible to write extremely powerful analysis & refactoring
> > tools.
> 
> Unfortunately the GCC AST almost certainly does not retain enough
> information for the kind of transformations you want to do.  This is a
> deficiency in the AST, but I can't even estimate how difficult it
> would be to fix.

Thank you for your answer, that saved me a lot of time and struggle. Digging a 
bit further, I see that some other constructs are already simplified or even 
trimmed in the AST, so mapping back to source indeed does not seem reasonably 
feasible. This makes perfect sense given GCC's background and primary goals, 
though. I'll give it a try with Clang plugins.

Cheers,

-- 
mefyl
