public inbox for gcc@gcc.gnu.org
* C/C++ codes searching/browsing
@ 2000-07-19 20:09 Habib Khalfallah
  2000-07-20 16:09 ` Bruce Stephens
  0 siblings, 1 reply; 6+ messages in thread
From: Habib Khalfallah @ 2000-07-19 20:09 UTC (permalink / raw)
  To: gcc

Hello,

I would like to enquire if any development of a C/C++ 
code indexing/browsing facility has been or is in the
process of being done. This facility would use the
front end of the gcc compiler to generate an index of
the source that facilitates searches of the source
base.

regards,

Habib Khalfallah




* Re: C/C++ codes searching/browsing
  2000-07-19 20:09 C/C++ codes searching/browsing Habib Khalfallah
@ 2000-07-20 16:09 ` Bruce Stephens
  0 siblings, 0 replies; 6+ messages in thread
From: Bruce Stephens @ 2000-07-20 16:09 UTC (permalink / raw)
  To: Habib Khalfallah; +Cc: gcc

Habib Khalfallah <hkhalfallah@yahoo.ca> writes:

> I would like to enquire if any development of a C/C++ code
> indexing/browsing facility has been or is in the process of being
> done. This facility would use the front end of the gcc compiler to
> generate an index of the source that facilitates searches of the
> source base.

cxref uses a modified gcc front-end, I think---the C parser, anyway.

There was brief discussion a week or so ago on this mailing list about
adding something like this to support SWIG.  It was suggested that it
probably wouldn't be too hard to do and to maintain, and some of the
code could be shared across front-ends.  I've no idea whether anyone's
working on it.

A couple of non-gcc solutions are SDS
(<URL: http://sds.sourceforge.net/ >) which uses an OpenC++-based
parser, and Source Navigator
(<URL: http://sources.redhat.com/sourcenav/ >, if memory serves) which
has its own parser.


* Re: C/C++ codes searching/browsing
  2000-08-02 12:10 Brad King
  2000-08-02 14:46 ` Artem Khodush
@ 2000-08-03 11:17 ` Bruce Stephens
  1 sibling, 0 replies; 6+ messages in thread
From: Bruce Stephens @ 2000-08-03 11:17 UTC (permalink / raw)
  To: gcc

Brad King <brad.king@kitware.com> writes:

> A few weeks ago I noticed a message on this list asking about a tool to do
> C++ code searching/browsing.  While there were a few third-party tools
> listed in a response to this message, there doesn't seem to be a general
> purpose solution.

Well, there's <URL: http://sources.redhat.com/sourcenav/ >.  It's not a
real compiler, though, so probably something based on gcc could do
better in some respects---it would be cool to have a choice of
C/C++/Java parsers, regardless.

[...]

> Source file/line number element attributes could be added to allow
> these tools to go back to the source and extract comments
> corresponding to each definition.

Source Navigator seems to require column numbers, too.  Apparently it
uses them for syntax highlighting.  They probably aren't critical, but
it would be nice if they could be provided.  I think this would be
hard, unfortunately---ir.texi only mentions line numbers.

[...]

> I would appreciate any comments or suggestions on this, especially
> about the format of the XML output.  Also, is anyone else working on
> such a project, even if using a different output format?

Check out SDS, <URL: http://sourceforge.net/projects/sds/ >.  They have
a variety of parsers and tools, and the intermediate language is XML
(or gzipped XML, or something---maybe an XML-based database).  They've
been developing this for at least a year or two, so probably the DTD
is relatively well thought out, and suitable for a variety of
languages, which would be convenient.

[...]


* Re: C/C++ codes searching/browsing
  2000-08-02 12:10 Brad King
@ 2000-08-02 14:46 ` Artem Khodush
  2000-08-03 11:17 ` Bruce Stephens
  1 sibling, 0 replies; 6+ messages in thread
From: Artem Khodush @ 2000-08-02 14:46 UTC (permalink / raw)
  To: Brad King, gcc

Brad King <brad.king@kitware.com> wrote:

> A few weeks ago I noticed a message on this list asking about a tool to do
> C++ code searching/browsing.  While there were a few third-party tools
> listed in a response to this message, there doesn't seem to be a general
> purpose solution.

Well, I think that the best (open-source) approximation to a "general purpose 
solution for code searching/browsing" currently is source navigator
( http://sources.redhat.com/sourcenav/ ), released just a few weeks ago.

> I am working on an addition to GCC's C++ front-end that adds a compiler
> flag (-fxml) which triggers special processing at the end of a translation
> unit.  Eventually, this processor will walk the entire translation unit,
> starting at the global namespace (or perhaps a specified scope name?),
> and write out an XML file describing the contents of each namespace/class
> scope.  These files could then be read by any tool for any purpose,
> freeing these tools of the nasty task of parsing C++.  Source
> file/line number element attributes could be added to allow these tools
> to go back to the source and extract comments corresponding to each
> definition.

Speaking of browsing code, parsing C++ is not the hardest thing, especially
when you have g++ source code with wonderfully documented tree structure.
Ideally, you need cross-referencing information for all source files
that went into a particular executable image, collected in one place,
so that you can go directly from a call to a function to its definition
in another file. And ideally, the browsing tool should not ask you
"which of a set of overloaded definitions of foo would you like to go to",
because the compiler already did overload resolution for every call, and
that information should be recorded in the xref database.
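
A minimal C++ sketch of that idea, assuming the compiler can hand over the
declaration it picked for each call. The names XrefDb, SourcePos and
Declaration are hypothetical and do not come from gcc or Source Navigator;
the point is only that a call site maps to exactly one declaration, so the
browser never has to ask which overload was meant.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// A position in the source base: file, line, column.
struct SourcePos {
    std::string file;
    int line;
    int column;
    bool operator<(const SourcePos& o) const {
        return std::tie(file, line, column) <
               std::tie(o.file, o.line, o.column);
    }
};

// One declaration, identified by its full signature so that overloads
// of the same name stay distinct.
struct Declaration {
    std::string signature;  // e.g. "foo(double)"
    SourcePos definedAt;
};

// Hypothetical xref store: for each call site it keeps the id of the
// declaration that overload resolution selected.
class XrefDb {
public:
    // Registers a declaration and returns its id.
    int addDeclaration(const Declaration& d) {
        decls_.push_back(d);
        return static_cast<int>(decls_.size()) - 1;
    }

    // Records that the call at `site` resolved to declaration `declId`.
    void addCall(const SourcePos& site, int declId) {
        assert(declId >= 0 && declId < static_cast<int>(decls_.size()));
        calls_[site] = declId;
    }

    // Returns the resolved declaration for a call site, or nullptr if the
    // site is unknown. The pointer is invalidated by later addDeclaration
    // calls, so it should be used immediately.
    const Declaration* definitionFor(const SourcePos& site) const {
        auto it = calls_.find(site);
        return it == calls_.end() ? nullptr : &decls_[it->second];
    }

private:
    std::vector<Declaration> decls_;
    std::map<SourcePos, int> calls_;
};
```

With two overloads of foo registered, a lookup on a call site returns the one
recorded for it, including the file and line of its definition.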

I think the best way to handle this is to create a new gcc front-end, which
does cross-reference generation instead of compiling, and merges xref
info for different source files instead of linking. That way, you just
"make CC=new-front-end" for your project, and have relevant xref databases
built for all your executable images, with all linkage correctly resolved.
And I think that XML is not the best storage format for xref information.
First, you need fast lookup during code browsing. Second, there might be a
need to update the xref database incrementally for this tool to be usable for
large projects. So some kind of "real" database may be needed for storage
instead of XML files (this is the approach taken in Source Navigator).
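
The "merge instead of link" step could look roughly like the C++ sketch
below. UnitXref, ImageXref and Location are made-up names, and the symbol key
is assumed to be whatever link-level name the front-end emits. Lookup here is
a plain scan over the merged units, so this shows the merging and the
incremental replacement of one file's data, not the fast lookup a real
database would give.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// What one translation unit contributes, keyed by link-level symbol name.
struct UnitXref {
    std::string sourceFile;
    std::map<std::string, int> definitions;               // symbol -> line
    std::map<std::string, std::vector<int>> references;   // symbol -> lines
};

struct Location {
    std::string file;
    int line;
};

// Merged view for one executable image, analogous to what a linker sees.
class ImageXref {
public:
    // Adds or replaces the data for one source file. Replacing only the
    // changed file is what makes the update incremental.
    void update(const UnitXref& unit) {
        assert(!unit.sourceFile.empty());
        units_[unit.sourceFile] = unit;
    }

    // Resolves a symbol to its definition across all merged units.
    bool findDefinition(const std::string& symbol, Location& out) const {
        for (const auto& entry : units_) {
            auto it = entry.second.definitions.find(symbol);
            if (it != entry.second.definitions.end()) {
                out = Location{entry.first, it->second};
                return true;
            }
        }
        return false;
    }

    // Collects every recorded use of a symbol across all merged units.
    std::vector<Location> findReferences(const std::string& symbol) const {
        std::vector<Location> found;
        for (const auto& entry : units_) {
            auto it = entry.second.references.find(symbol);
            if (it == entry.second.references.end()) continue;
            for (int line : it->second)
                found.push_back(Location{entry.first, line});
        }
        return found;
    }

private:
    std::map<std::string, UnitXref> units_;  // keyed by source file name
};
```

Calling update() again for a file that was recompiled drops its old entries
and leaves every other unit untouched.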

Separating this feature into a new front-end will (I hope) simplify integration
into the gcc distribution and maintenance. Sure, just adding a new switch
to gcc is much easier, but I suspect that the amount of code implementing
this functionality will eventually grow to some unpleasant size.

> I would like to know if this is something that could be added to GCC
> permanently, and thus be included in the distribution and maintained
> as GCC is updated.  Obviously we would have to agree on a DTD for the
> XML output.  The example below demonstrates an approach I've been thinking
> about.  It uses a minimal amount of XML's constructs, but something more
> advanced could be designed if it is appropriate.
> 
> I would appreciate any comments or suggestions on this, especially about
> the format of the XML output.  Also, is anyone else working on such a
> project, even if using a different output format?

I was thinking of implementing a new parser for the source-navigator, based
on gcc (specifically, on g++).

The XML output as proposed is, I think, unsuitable for browsing. For each entity
referenced in a function body, you need to record the exact position of the
reference in the source code, and provide a direct link (not just a name) to its
declaration (and possibly definition). Also, in real life, you should take
macros into account (but that's a whole other story).

Best regards,
Artem.


* RE: C/C++ codes searching/browsing
@ 2000-08-02 12:45 Anjul Srivastava
  0 siblings, 0 replies; 6+ messages in thread
From: Anjul Srivastava @ 2000-08-02 12:45 UTC (permalink / raw)
  To: 'Brad King', gcc

This is a nice direction to go in and a much needed utility.

Let me suggest a generalization.

First, let's not think about XML, and instead focus on what precisely is the
information that we wish to provide. Let's consider the possibilities. I
will use a modified XML notation below, where instead of <tag>content</tag>
I will write <tag>(content)

The first and simplest possibility is to provide token information. For
example, <keyword-int>(int), <integer constant value=50>(50), ... This would
be useful for a pretty-printer but it has serious limitations for anything
more significant.

The second level is a generalization of the first to provide grammar
information. For example,
<statement>(<expression-statement>(<lvalue>(x)<assign-op>(=)<integer
constant>(50)<semi-colon>(;))) This information is very useful in asking
almost any question, say, where is "x" used as an lvalue in function "foo"?

The third level could be a specialization of the second level, that would
provide output with a less complex structure than the grammar, thus making
it easier to use. For example: <function-call name="foo" nargs=5
typelist="int, float, float, int, int">(...[second-level analysis of the
statement "foo(7,1.25,3.14,8,1024)"])

There are a lot of design choices here and my suggestion is to think really
hard about what information we want to provide (and we can provide
efficiently), then think of a good notation that is amenable to further
processing. We could have a filter that could reformat the information
generated into XML according to some DTD, but let's not let the notation and
ways of XML restrict the kind of information we provide. My recommendation
is the second level with filters to produce third level output and XML
output. The third-level filters may be developed later on a continuing
basis.
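
To make the second level and a filter on top of it concrete, here is a small
C++ sketch. Node, render and collectLvalues are illustrative names only, not
an existing gcc interface. The tree is written out in the <tag>(content)
notation used above, and the filter answers the "where is x used as an
lvalue" kind of question by walking the same tree.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A node of the second-level (grammar) output: a tag holding either
// literal text (leaf) or child nodes (interior).
struct Node {
    std::string tag;
    std::string text;
    std::vector<Node> children;
};

Node leaf(const std::string& tag, const std::string& text) {
    return Node{tag, text, {}};
}

Node group(const std::string& tag, std::vector<Node> children) {
    return Node{tag, "", std::move(children)};
}

// Writes a node in the modified notation: <tag>(content).
std::string render(const Node& n) {
    // A node carries text or children, never both.
    assert(n.children.empty() || n.text.empty());
    std::string out = "<" + n.tag + ">(";
    if (n.children.empty()) {
        out += n.text;
    } else {
        for (const Node& c : n.children) out += render(c);
    }
    return out + ")";
}

// A sample filter over the grammar tree: collect names used as lvalues.
void collectLvalues(const Node& n, std::vector<std::string>& out) {
    if (n.tag == "lvalue") out.push_back(n.text);
    for (const Node& c : n.children) collectLvalues(c, out);
}

// The statement "x = 50;" from the example above.
Node sampleStatement() {
    return group("statement", {group("expression-statement", {
        leaf("lvalue", "x"),
        leaf("assign-op", "="),
        leaf("integer constant", "50"),
        leaf("semi-colon", ";")})});
}
```

A third-level or XML filter would be another function over Node in the same
style, which is the reason to keep the tree separate from any one output
notation.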

Of course, what actually is implemented ultimately is the choice of the
implementor, so if it is more convenient for Brad to build an XML outputting
program for whatever reason, so be it. Something is better than nothing.

Thank you very much, Brad!

-----Original Message-----
From: Brad King [ mailto:brad.king@kitware.com ]
Sent: Wednesday, August 02, 2000 2:09 PM
To: gcc@gcc.gnu.org
Subject: Re: C/C++ codes searching/browsing


Hello Everyone,

A few weeks ago I noticed a message on this list asking about a tool to do
.
.
.


* Re: C/C++ codes searching/browsing
@ 2000-08-02 12:10 Brad King
  2000-08-02 14:46 ` Artem Khodush
  2000-08-03 11:17 ` Bruce Stephens
  0 siblings, 2 replies; 6+ messages in thread
From: Brad King @ 2000-08-02 12:10 UTC (permalink / raw)
  To: gcc

Hello Everyone,

A few weeks ago I noticed a message on this list asking about a tool to do
C++ code searching/browsing.  While there were a few third-party tools
listed in a response to this message, there doesn't seem to be a general
purpose solution.

I am working on an addition to GCC's C++ front-end that adds a compiler
flag (-fxml) which triggers special processing at the end of a translation
unit.  Eventually, this processor will walk the entire translation unit,
starting at the global namespace (or perhaps a specified scope name?),
and write out an XML file describing the contents of each namespace/class
scope.  These files could then be read by any tool for any purpose,
freeing these tools of the nasty task of parsing C++.  Source
file/line number element attributes could be added to allow these tools
to go back to the source and extract comments corresponding to each
definition.

I would like to know if this is something that could be added to GCC
permanently, and thus be included in the distribution and maintained
as GCC is updated.  Obviously we would have to agree on a DTD for the
XML output.  The example below demonstrates an approach I've been thinking
about.  It uses a minimal amount of XML's constructs, but something more
advanced could be designed if it is appropriate.

I would appreciate any comments or suggestions on this, especially about
the format of the XML output.  Also, is anyone else working on such a
project, even if using a different output format?

Thanks,
-Brad

Here is an example with some C++ code, and XML output of its structure.
I just made up this example off the top of my head while writing this
message, so this is not intended as an example of real use.  The
format of the XML is similar to that produced by a preliminary
implementation I've added to my own copy of GCC's C++ front end.


Example C++ code:

namespace N
{
  class X
  {
  public:
    X(const int& x): m_Int(x) {}
    void Print(ostream& os) { os << m_Int; }
    int Get(void) const { return m_Int; }
    operator int() { return m_Int; }
    X& operator += (const X& r) { m_Int += r.m_Int; return *this; }
    bool operator < (const X& r) const { return (m_Int < r.m_Int); }
    static const char* GetClassName(void) { return "X"; }
  private:
    int m_Int;
  };

  X* Make_X(int x) { return new X(x); }
}


Resulting XML output (minus headers):

<Namespace name="N" id_name="_1N">
  <Class name="X" id_name="_1X" sourcefile="X.h" sourceline="3">
    <Constructor name="X" argc="1" access="public">
      <Argument type="int&amp;" cv="const" name="x"></Argument>
    </Constructor>
    <Method name="Print" argc="1" access="public" static="0" cv="">
      <Argument type="ostream&amp;" cv="" name="os"></Argument>
      <Returns type="void" cv=""></Returns>
    </Method>
    <Method name="Get" argc="0" access="public" static="0" cv="const">
      <Returns type="int" cv=""></Returns>
    </Method>
    <Converter name="operator int" access="public">
      <Returns type="int" cv=""></Returns>
    </Converter>
    <Operator name="+=" argc="1" access="public" id_name="__apl" cv="">
      <Argument type="X&amp;" cv="const" name="r"></Argument>
      <Returns type="X&amp;" cv=""></Returns>
    </Operator>
    <Operator name="&#x003C;" argc="1" access="public" id_name="__lt"
              cv="const">
      <Argument type="X&amp;" cv="const" name="r"></Argument>
      <Returns type="bool" cv=""></Returns>
    </Operator>
    <Method name="GetClassName" argc="0" access="public" static="1" cv="">
      <Returns type="char*" cv="const"></Returns>
    </Method>
    <DataMember name="m_Int" type="int" access="private" static="0" cv="">
    </DataMember>
  </Class>
  <Function name="Make_X" argc="1" static="0">
    <Argument type="int" cv="" name="x"></Argument>
    <Returns type="X*" cv=""></Returns>
  </Function>
</Namespace>
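
For anyone who wants to experiment with this output shape outside the
compiler, a rough C++ sketch of the writer follows. Decl is a hypothetical
stand-in for the front-end's declaration nodes, not a GCC type, and the
element and attribute names are simply whatever the caller supplies. It
escapes attribute values so that types such as "int&" and operator names such
as "<" stay well-formed XML.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Escapes the characters that may not appear literally in an XML
// attribute value delimited by double quotes.
std::string escapeXml(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Hypothetical stand-in for a declaration in a namespace or class scope.
struct Decl {
    std::string element;  // e.g. "Namespace", "Class", "Method"
    std::vector<std::pair<std::string, std::string>> attrs;
    std::vector<Decl> members;
};

// Walks a scope depth-first and writes one element per declaration,
// indenting nested scopes by two spaces.
void writeXml(const Decl& d, std::ostream& os, int depth = 0) {
    assert(!d.element.empty());
    const std::string indent(depth * 2, ' ');
    os << indent << '<' << d.element;
    for (const auto& a : d.attrs)
        os << ' ' << a.first << "=\"" << escapeXml(a.second) << '"';
    os << ">\n";
    for (const Decl& m : d.members) writeXml(m, os, depth + 1);
    os << indent << "</" << d.element << ">\n";
}

// Convenience wrapper that returns the XML as a string.
std::string toXml(const Decl& root) {
    std::ostringstream os;
    writeXml(root, os);
    return os.str();
}
```

Unlike the hand-written example, this puts the closing tag of an empty
element on its own line; that is only a layout choice in the sketch.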


If you care why I'm interested in doing this:

FYI, my motivation for creating this tool is that I'm writing a new tool
to generate wrappers for C++ programs in interpreted languages.  SWIG is
good for C, but lacks support for much of C++.  This tool will be used to
wrap classes in an open project I'm involved with for the National
Library of Medicine.  Many of the classes are templated, so preselected
specializations will be wrapped.  I experimented with several parsers
to deal with the templated C++ code, but soon realized that GCC's
front-end did all the parsing and specialization work already.  This
XML-output addition to the front-end seems to be a solution that does what
I need, and can be used by others as well.





Thread overview: 6+ messages
2000-07-19 20:09 C/C++ codes searching/browsing Habib Khalfallah
2000-07-20 16:09 ` Bruce Stephens
2000-08-02 12:10 Brad King
2000-08-02 14:46 ` Artem Khodush
2000-08-03 11:17 ` Bruce Stephens
2000-08-02 12:45 Anjul Srivastava
