public inbox for systemtap@sourceware.org
* Re: (PR11207) Macroprocessor discussion
       [not found] <f967988c-d2f7-49a5-b29c-0201b125db42@zmail19.collab.prod.int.phx2.redhat.com>
@ 2012-07-05 15:54 ` Serguei Makarov
  2012-07-11 21:02   ` (PR11207) Macroprocessor discussion -- current safe-to-implement proposal Serguei Makarov
  0 siblings, 1 reply; 3+ messages in thread
From: Serguei Makarov @ 2012-07-05 15:54 UTC (permalink / raw)
  To: systemtap

Frank,

Thanks for taking the time to offer feedback. There's really only one major issue blocking the design, which I discuss below. (As for the docstring-manipulation stuff not being appealing -- that's all right, it won't be in the initial prototype anyway, and we can live without it if I can't think of a nicer way to get the preprocessor to do what I wanted from that feature...)

> This is a problem.  One of the expected uses of this mechanism was to
> let tapsets define macros for use by other scripts, such as for performing
> kernel-flavoured offset_of(), container_of() type operations.

Hm, based on my reading of the code I felt that implementing this makes for a thorny problem with the way tapset inclusion is currently arranged. From what I know, identifiers across files are resolved as follows:
- all of the tapset files are parsed independently of each other (see main.cxx:L) and collected in s.library_files (this implies that macroexpansion is performed on each one)
- during semantic analysis, we resolve identifiers by finding the files containing referenced identifiers and adding them to s.files, then checking which tapsets *those* import in a transitive closure sorta thing
- everything that ends up in the transitive closure is compiled into the final module
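
The resolution steps above can be sketched as a worklist computation. This is only an illustration in Python, not the actual stap code: the function name `resolve_closure`, the shape of `library_files` (file name mapped to a pair of defined and referenced identifiers), and the sample file names are all assumptions made up for the example.

```python
from collections import deque

def resolve_closure(script_refs, library_files):
    """Return the tapset files pulled in by a script, in order of discovery.

    library_files maps a file name to (defined identifiers, referenced
    identifiers). Both the layout and the names are hypothetical.
    """
    # Index: identifier -> the first file that defines it.
    defined_in = {}
    for fname, (defines, _refs) in library_files.items():
        for ident in defines:
            defined_in.setdefault(ident, fname)

    selected = []          # files that end up in the final module
    seen = set()
    pending = deque(script_refs)
    while pending:
        ident = pending.popleft()
        fname = defined_in.get(ident)
        if fname is None or fname in seen:
            continue       # not a library identifier, or file already added
        seen.add(fname)
        selected.append(fname)
        # The newly added file may itself reference other tapsets.
        pending.extend(sorted(library_files[fname][1]))
    return selected
```

Note that every file has to be parsed already before this step can run at all, since the sets of defined and referenced identifiers come out of the parse.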

This is a chicken-and-egg problem, since tapsets obviously are allowed to build on one another, and to properly parse and macroexpand a file we need to know what macros are in other files, so we need to macroexpand those files, which requires knowing what macros are pulled in from other files for *that* file...

One hack I can think of for getting around the issue is to force tapset writers to explicitly put directives such as

%include_macros("other_tapset.stp")

if they want to use macros from that tapset, in order for stap to have the information to parse files in the correct order. (The most obvious solution for this requires the parser to be able to suspend parsing one file, go off and parse another file, and then return to the first one where it left off. This seems reasonable, but some double-checking is required to be sure that nothing in parse.cxx assumes files are handled one at a time...)
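
A rough sketch of the "suspend, parse the other file, resume" idea, written in Python purely for illustration. The directive syntax is the one proposed above; the function name `parse_order`, the in-memory `sources` dictionary, and the circular-inclusion check are assumptions of the sketch, and the real parser in parse.cxx is not modelled here.

```python
import re

# Matches the directive proposed above, alone on a line.
INCLUDE_RE = re.compile(r'^\s*%include_macros\("([^"]+)"\)\s*$')

def parse_order(root, sources):
    """Return the order in which files finish parsing, starting from root.

    sources maps a file name to its text (a stand-in for reading tapsets
    from disk). Raises ValueError on circular macro inclusion.
    """
    order = []            # files whose macros are fully known, in order
    macros_ready = set()
    in_progress = []      # stack of files currently suspended or active

    def parse(name):
        if name in macros_ready:
            return
        if name in in_progress:
            cycle = " -> ".join(in_progress + [name])
            raise ValueError("circular macro inclusion: " + cycle)
        in_progress.append(name)
        for line in sources[name].splitlines():
            m = INCLUDE_RE.match(line)
            if m:
                # Suspend this file and parse the other one first.
                parse(m.group(1))
            # Any other line would be handed to the real parser here.
        in_progress.pop()
        macros_ready.add(name)
        order.append(name)

    parse(root)
    return order
```

The recursion is what stands in for suspending and resuming: a file only lands in `order` after everything it includes macros from has been finished.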

[There was originally a digression here about how much time is spent parsing the tapset files sequentially, but it's not that much time as a proportion of the script run, so the efficiency concerns I was thinking about are basically irrelevant. (They may be applicable to Pass 2, but that's a discussion for a different day.) And pass timings are automatically printed for verbose > 0. :-)]

The final script being compiled is exempted from the need to use explicit macro inclusion directives, since by that point there is no chicken-and-egg ambiguity.

(As a minor detail, we also have to flip the order of passes 1a and 1b. 1b parses the tapsets, while 1a parses the final script.)

That's not a very satisfying solution. However, unless I think of something else, our design space for this corner of the problem seems to be very limited...

- Serhei

(PS: Argh, I keep forgetting to use reply to all...)


* Re: (PR11207) Macroprocessor discussion -- current safe-to-implement proposal
  2012-07-05 15:54 ` (PR11207) Macroprocessor discussion Serguei Makarov
@ 2012-07-11 21:02   ` Serguei Makarov
  2012-07-11 21:06     ` Serguei Makarov
  0 siblings, 1 reply; 3+ messages in thread
From: Serguei Makarov @ 2012-07-11 21:02 UTC (permalink / raw)
  To: systemtap

This is the proposal I intend to implement in the next little
while (unless anyone has any last-second comments or objections).
It is designed to do about 80% of what we ideally want the
macroprocessor to do; the other 20% requires a fairly powerful engine,
which could conceivably be developed later and adapted to use the same
syntax as the current proposal.

Basics
- Token-based preprocessor is housed in parse.cxx.
- The parser filters its input through the preprocessor before analysing it.
- Docstrings cannot be manipulated or generated with macros; any
  functionality for this is delayed until a later point. (We will
  either develop a separate text-based mechanism, some kind of
  super-general-uber-engine, or just procrastinate on the issue
  forever.)

Definition Syntax
- one-line macro definition @define foo(...) ...body...
- multiline macro definition @define foo(...) %( ...body... %)
  - %( %) brackets inside the body (e.g. for stap conditionals) must balance
- optional, for later: heredoc-type macros @define foo(...) <<HERE
  - macro body then continues until we find a line that says 'HERE'
    (or whatever we put, e.g. 'END')
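
To make the balancing rule concrete, here is a small Python sketch of how the body of a one-line or multiline definition might be delimited. It is a simplification: it works on raw characters rather than tokens (the proposal is token-based), it ignores string literals and comments, and `read_macro_body` is a hypothetical name, not something in parse.cxx.

```python
def read_macro_body(text, pos):
    """Read a macro body starting at text[pos], just after '@define foo(...)'.

    Returns (body, index just past the body). A body opening with '%(' runs
    to the matching '%)'; anything else runs to the end of the line.
    """
    # Skip spaces and tabs between the parameter list and the body.
    while pos < len(text) and text[pos] in " \t":
        pos += 1

    if text.startswith("%(", pos):
        depth = 0
        i = pos
        while i < len(text):
            if text.startswith("%(", i):
                depth += 1
                i += 2
            elif text.startswith("%)", i):
                depth -= 1
                i += 2
                if depth == 0:
                    # Strip the outer brackets; nested ones stay in the body.
                    return text[pos + 2:i - 2].strip(), i
            else:
                i += 1
        raise SyntaxError("unbalanced %( %) in macro body")

    # One-line form: the body is the rest of the line.
    end = text.find("\n", pos)
    if end == -1:
        end = len(text)
    return text[pos:end].strip(), end
```

A nested stap conditional inside the body only works because its own %( %) pair bumps the depth up and back down before the closing %) of the definition is reached.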

Invocation Syntax
- @macro_name(param,param,...).
  - ( ) brackets inside the parameters must balance, so we can pass in
    complex expressions such as @foo(function(a,b,c)) and have it do
    the right thing.
- Unknown macros are passed to the parser without being expanded or
  causing an error. This enables us to continue using constructs such
  as @cast(...) transparently -- they are simply ignored.
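
The two invocation rules (balanced parentheses in parameters, unknown macros passed through) can be sketched as follows. This is Python for illustration only and again character-based rather than token-based. Since the proposal does not yet say how parameters are referenced inside a body, the sketch sidesteps that by mapping each macro name to a Python function; that mapping, and the names `split_args` and `expand`, are assumptions of the example. Parentheses inside string literals are not handled, and expanded text is not rescanned.

```python
import re

NAME_RE = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)\(")

def split_args(text, start):
    """Split macro arguments; text[start] is just past the opening '('.

    Returns (list of argument strings, index just past the closing ')').
    Commas inside nested parentheses do not separate arguments.
    """
    args, depth, current = [], 0, []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                args.append("".join(current).strip())
                return ([] if args == [""] else args), i + 1
            depth -= 1
        elif ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    raise SyntaxError("unbalanced parentheses in macro invocation")

def expand(text, macros):
    """Expand known @name(...) invocations; leave unknown ones untouched."""
    out, pos = [], 0
    while True:
        m = NAME_RE.search(text, pos)
        if m is None:
            out.append(text[pos:])
            return "".join(out)
        name = m.group(1)
        if name not in macros:
            # Unknown macro such as @cast: copy it through for the parser.
            out.append(text[pos:m.end()])
            pos = m.end()
            continue
        args, end = split_args(text, m.end())
        out.append(text[pos:m.start()])
        out.append(macros[name](args))
        pos = end
```

For example, with a hypothetical macro `twice` registered as `lambda a: "(2 * (%s))" % a[0]`, the input `@twice(f(a,b,c))` is treated as a single-argument call, while `@cast(...)` elsewhere in the same text is left exactly as written.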

Cross-Tapset Inclusion of Macros
- Deferred until later (macro definitions will start out local to one
  file), but promising options include:
  - Create a file, say, tapset/foo.stpm or tapset/macros/foo.stpm,
    which defines a macro 'foo' (must be same name as the file) that
    we want to be available globally.
  - Create a file, say, tapset/common.stpm. The preprocessor uses the
    macro definitions from this file in *every* other file it parses.
  - Wait for the special-rainbows-and-unicorns statically scoped
    engine to solve the problem for us.


* Re: (PR11207) Macroprocessor discussion -- current safe-to-implement proposal
  2012-07-11 21:02   ` (PR11207) Macroprocessor discussion -- current safe-to-implement proposal Serguei Makarov
@ 2012-07-11 21:06     ` Serguei Makarov
  0 siblings, 0 replies; 3+ messages in thread
From: Serguei Makarov @ 2012-07-11 21:06 UTC (permalink / raw)
  To: systemtap

Just to clarify -- the point of considering heredoc-type macros is that brackets inside the macro body don't have to balance -- the only thing that can close the macro is the heredoc-marker. Thus we can use them to emit constructs with unbalanced %( %) brackets.
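
As a sketch of why this works: a heredoc reader never looks at brackets at all, it only compares whole lines against the marker. The Python below is illustrative only; `read_heredoc_body` is a hypothetical name, and treating a line as the marker after stripping surrounding whitespace is an assumption, since the proposal does not settle that detail.

```python
def read_heredoc_body(lines, start, marker):
    """Collect lines from lines[start] up to a line consisting of marker.

    Returns (body text, index of the first line after the marker line).
    The body is taken verbatim, so unbalanced %( %) brackets are fine.
    """
    body = []
    for i in range(start, len(lines)):
        if lines[i].strip() == marker:
            return "\n".join(body), i + 1
        body.append(lines[i])
    raise SyntaxError("heredoc marker %r not found" % marker)
```

A body consisting of just an opening conditional and half a probe would be rejected by the %( %) form of @define but goes through here unchanged.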

