public inbox for kawa@sourceware.org
* Analyzing Scheme source code
From: Duncan Mak @ 2020-12-08 23:49 UTC
  To: kawa mailing list

Hello all,

I'm interested in analyzing some Scheme source code. Specifically, I want
to find what each file defines and what each file references.

I started writing my own analyzer with the match macro, and it looks
something like this:

(define (process-form form)
  (match form
    (['define [name @args] @body]
     (cons name (map process-form body)))
    (['define name value]
     value)
    (['let [[foo bar] ...] @body]
     (map process-form body))
    (['if test then else]
     (list (process-form test)
           (process-form then)
           (process-form else)))
    ([procedure @args]
     (cons procedure (map process-form args)))))
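
For comparison, here is a minimal sketch of the "what's defined in each
file" half of this idea in portable Scheme, without match. It only
recognizes literal (define ...) forms at top level. The helper names
(defined-name, defined-names, read-forms) are made up for illustration
and are not part of Kawa.

```scheme
;; Hypothetical helpers, not part of Kawa: a purely syntactic scan.

(define (defined-name form)
  ;; Returns the name bound by a (define ...) form, or #f for anything else.
  (and (pair? form)
       (eq? (car form) 'define)
       (pair? (cdr form))
       (let ((target (cadr form)))
         (cond ((symbol? target) target)          ; (define name value)
               ((and (pair? target)
                     (symbol? (car target)))      ; (define (name . args) body ...)
                (car target))
               (else #f)))))

(define (defined-names forms)
  ;; Names defined by the top-level forms, in source order.
  (let loop ((forms forms) (names '()))
    (cond ((null? forms) (reverse names))
          ((defined-name (car forms))
           => (lambda (name) (loop (cdr forms) (cons name names))))
          (else (loop (cdr forms) names)))))

(define (read-forms port)
  ;; Reads every datum from port until end of file.
  (let loop ((forms '()))
    (let ((form (read port)))
      (if (eof-object? form)
          (reverse forms)
          (loop (cons form forms))))))
```

For example, (defined-names (read-forms (open-input-string
"(define a 1) (define (f x) x) (display a)"))) evaluates to (a f). For a
file on disk, (call-with-input-file "some-file.scm" read-forms) would
supply the forms, where "some-file.scm" is a placeholder name.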

Thinking a bit more, rather than doing it myself, I thought maybe I could
reuse the existing machinery that's in Kawa already, i.e.
https://www.gnu.org/software/kawa/internals/semantic-analysis.html

I think the trick is to get an instance of a Translator for a particular
file, and then call `rewrite` and possibly inspect the resulting Expression
(which ought to be an instance of a ModuleExp?)

(import (class gnu.expr Language NameLookup)
        (class gnu.kawa.io InPort)
        (class gnu.text Lexer SourceMessages)
        (class kawa.lang Translator))

(define (process-file filename)
  ;; let* so that the lexer binding can refer to lang
  (let* ((lang  (Language:getDefaultLanguage))
         (lexer (lang:getLexer (InPort:openFile filename)
                               (SourceMessages))))
    ;; This only constructs a Translator; the lexer's input is never
    ;; handed to it.
    (Translator lang lexer:messages (NameLookup lang))))

What I have above seems to only result in an empty Translator.

What's the right way to set up the environment?


-- 
Duncan.

* Re: Analyzing Scheme source code
From: Duncan Mak @ 2020-12-09  2:32 UTC
  To: kawa mailing list

I played around some more and this now prints out all the declarations in a
file:

(import (class gnu.expr Declaration Language ModuleExp ModuleManager
               NameLookup)
        (class gnu.kawa.io InPort)
        (class gnu.text Lexer SourceMessages)
        (class kawa.lang Translator))

(define (print-decls filename)
  (let* ((language   (Language:getDefaultLanguage))
         (port       (InPort:openFile filename))
         (lexer      (language:getLexer port (SourceMessages)))
         (manager    (ModuleManager:getInstance))
         (minfo      (manager:findWithSourcePath port:name))
         (translator (language:parse lexer Language:PARSE_IMMEDIATE minfo))
         (module ::ModuleExp (translator:currentModule)))
    (let loop ((decl ::Declaration (module:firstDecl)))
      (unless (eq? #!null decl)
        (format #t "~A~%" decl)
        (loop (decl:nextDecl))))))


-- 
Duncan.

* Re: Analyzing Scheme source code
From: Duncan Mak @ 2020-12-09  3:00 UTC
  To: kawa mailing list

Using Declarations got me somewhere, but really I'm interested in all the
top-level forms, and not just (define x ...).

For example, it'd be useful to see (define-record-type ...) forms too.
Right now, I don't know how to get at them.
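
As a purely syntactic fallback, the names bound by a define-record-type
form can be read straight off the S-expression. The sketch below assumes
the R7RS shape (define-record-type type (constructor field ...) predicate
field-spec ...). The procedure name record-type-names is made up for
illustration, and this does not use Kawa's Expression tree.

```scheme
;; Hypothetical helper, not part of Kawa.
(define (record-type-names form)
  ;; Returns the identifiers the form binds (type, constructor, predicate,
  ;; accessors, modifiers), or '() if form has some other shape.
  (if (and (pair? form)
           (eq? (car form) 'define-record-type)
           (list? form)
           (>= (length form) 4))
      (let ((type   (list-ref form 1))
            (ctor   (list-ref form 2))
            (pred   (list-ref form 3))
            (fields (list-tail form 4)))
        (append
         (if (symbol? type) (list type) '())
         (cond ((symbol? ctor) (list ctor))
               ((and (pair? ctor) (symbol? (car ctor)))
                (list (car ctor)))
               (else '()))
         (if (symbol? pred) (list pred) '())
         ;; Each field spec is (field accessor) or (field accessor modifier);
         ;; everything after the field name is a bound procedure name.
         (apply append
                (map (lambda (spec)
                       (if (and (pair? spec) (list? spec))
                           (cdr spec)
                           '()))
                     fields))))
      '()))
```

For example, applying it to
(define-record-type point (make-point x y) point? (x point-x) (y point-y set-point-y!))
gives (point make-point point? point-x point-y set-point-y!).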

Another interesting question is how to handle macros that I define myself.
I guess I'll somehow have to tell the Language instance about them.

Also, I can't quite figure out how to open up the inside of a Declaration
either -- I'm looking for a getBody() method that might give me an
Expression[], but I haven't been able to find any methods of that sort.


Duncan.


* Re: Analyzing Scheme source code
From: Per Bothner @ 2020-12-09  3:19 UTC
  To: kawa

On 12/8/20 6:32 PM, Duncan Mak via Kawa wrote:
> I started writing my own analyzer with the match macro, and it looks
> something like this:
> 
> (define (process-form form)
>    (match form
>      (['define [name @args] @body]
>       (cons name (map process-form body)))
>   ... etc ...

That would not handle macros or imports, without a lot of effort.
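
To make the limitation concrete, the sketch below lists the head symbols
of top-level forms that a matcher covering only define, let and if would
fall through to its procedure-call clause. It is plain portable Scheme;
known-heads and unhandled-heads are names made up for illustration.

```scheme
;; Hypothetical names, not part of Kawa.
(define known-heads '(define let if))   ; forms the hand-written matcher covers

(define (unhandled-heads forms)
  ;; Distinct head symbols, in first-seen order, of the top-level forms
  ;; whose head is not in known-heads.
  (let loop ((forms forms) (seen '()))
    (cond ((null? forms) (reverse seen))
          ((and (pair? (car forms))
                (symbol? (caar forms))
                (not (memq (caar forms) known-heads))
                (not (memq (caar forms) seen)))
           (loop (cdr forms) (cons (caar forms) seen)))
          (else (loop (cdr forms) seen)))))
```

Given the forms (import (scheme base)), a define-syntax of a macro def,
(def answer 42), (define x 1) and (display x), the result is
(import define-syntax def display). The scan cannot tell that def
introduces a definition while display is an ordinary call, which is the
information macro expansion would supply.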

> Thinking a bit more, rather than doing it myself, I thought maybe I could
> reuse the existing machinery that's in Kawa already,

I think that is a better way to go, though it would need internal undocumented APIs.

There is an undocumented kawa.expressions library (kawa/lib/kawa/expressions.scm),
though it is currently used mostly for optimizing map and some other procedures.
See compile_map.scm and compile_misc.scm for uses.

To scan for uses, define a sub-class of ExpVisitor or ExpExpVisitor
looking for ReferenceExp.

For a more ambitious approach, one could beef up KawaLanguageServer,
and use existing clients/IDEs.

> For example, it'd be useful to see (define-record-type ...) forms too,
> right now, I don't know how to get at them.

Compiling a sample program with --debug-print-expr lets you see the
Expression tree generated for the program.

> Also, I can't quite figure out how to open up the inside of a Declaration
> either -- I'm looking for a getBody() method that might give me an
> Expression[], but I haven't been able to find any methods of that sort.

A Declaration is the "symbol" being defined.  The form in the source code
that does the defining is usually a SetExp (if at top-level).
It is possible to map from Declaration to defining expressions, but it
includes assignments (since it is used for optimization), and is rather
complicated.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/
