* [wwwdocs] projects/strees/index.html -- convert to HTML 5
@ 2018-09-02 13:21 Gerald Pfeifer
0 siblings, 0 replies; only message in thread
From: Gerald Pfeifer @ 2018-09-02 13:21 UTC (permalink / raw)
To: gcc-patches
Committed.
Gerald
Convert to HTML 5: Use <code> instead of <tt>, omit align="center",
use CSS.
Index: projects/strees/index.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/strees/index.html,v
retrieving revision 1.4
diff -u -r1.4 index.html
--- projects/strees/index.html 1 Sep 2018 23:42:10 -0000 1.4
+++ projects/strees/index.html 2 Sep 2018 13:01:25 -0000
@@ -4,23 +4,23 @@
<link rel="stylesheet" type="text/css" href="https://gcc.gnu.org/gcc.css" /></head>
<body>
-<h1 align="center">Stree design notes</h1>
-<p align="center"><font size="-1">
+<h1>Stree design notes</h1>
+<p class="smaller">
<i>Matt Austern, Robert Bowdidge, Geoff Keating</i>
-</font></p>
+</p>
<p>The stree project is based on three fundamental premises. First:
for an important class of development tasks (roughly: GUI programs
-written in a relatively simple subset of C++, compiled at <tt>-O0
--g</tt>), compilation time is dominated by the C++ front end. Second:
+written in a relatively simple subset of C++, compiled at <code>-O0
+-g</code>), compilation time is dominated by the C++ front end. Second:
the performance of the C++ front end is dominated by memory allocation
and management. This includes memory allocation, initializing newly
allocated objects, and bookkeeping for garbage collection. Reducing
front end memory usage should thus improve front end
performance. Third: many programs consist of small source files that
include truly enormous header files. Such header files
-include <tt><iostream></tt> (25,000 lines),
-Apple's <tt><Carbon/Carbon.h></tt> (91,000 lines), and the X11
+include <code><iostream></code> (25,000 lines),
+Apple's <code><Carbon/Carbon.h></code> (91,000 lines), and the X11
headers. Any given translation unit only uses a tiny fraction of the
declarations in one of these headers.</p>
@@ -47,23 +47,23 @@
generate strees.</li>
<li>Usually we generate strees only for declarations that are not
also definitions. So, for example, we would generate an stree
- for <tt>void do_nothing();</tt> (which the middle end and back
+ for <code>void do_nothing();</code> (which the middle end and back
end don't necessarily have to know about), but we would generate
- a full tree for <tt>void >do_nothing() { }</tt> (which the
+ a full tree for <code>void >do_nothing() { }</code> (which the
middle end has to expand to RTL).</li>
<li>For a declaration that is not a definition, there is a simple way
to characterize whether or not the definition is "needed": some
other declaration refers to it by name. For example: if a
- function <tt>xyzzy</tt> is declared but nobody ever defines it,
+ function <code>xyzzy</code> is declared but nobody ever defines it,
takes its address, or calls a function with that name, then
that declaration isn't needed, and generating a decl tree for
it is wasted time and space.</li>
<li>This definition of "needed" immediately leads to an
implementation technique. References to a decl by name always
- go through a <tt>cxx_binding</tt>
- object. (See <tt>name_lookup.h</tt>). So we just need to
- make <tt>cxx_binding</tt> a little bit more complicated: when
- we ask a <tt>cxx_binding</tt> for the value of the binding, we
+ go through a <code>cxx_binding</code>
+ object. (See <code>name_lookup.h</code>). So we just need to
+ make <code>cxx_binding</code> a little bit more complicated: when
+ we ask a <code>cxx_binding</code> for the value of the binding, we
check to see whether we have a tree or an stree. If the
latter, we expand it into a tree and cache the expanded
version.</li>
@@ -78,7 +78,8 @@
implementation reason why we don't want to defer emission of
diagnostics. Early diagnostics reduce the need for global
state to be remembered for each stree: source code
- position, <tt>current_binding_level</tt>, <tt>current_class_ptr</tt>,
+ position, <code>current_binding_level</code>,
+ <code>current_class_ptr</code>,
etc.)</li>
</ul>
@@ -86,41 +87,42 @@
<h2>An example: enumerators</h2>
<p>Consider the front end data structure for a simple enumeration
-declaration, <tt>enum foo { a, b };</tt>. We have two enumerators. For
-each one we need to know its name, its type, the underlying integer
+declaration, <code>enum foo { a, b };</code>. We have two enumerators.
+For each one we need to know its name, its type, the underlying integer
type used to represent it, and its value. At present we represent
-enumerators with <tt>CONST_DECL</tt> nodes, so each enumerator takes
-128 bytes for the <tt>tree_decl</tt> node, plus additional memory
-for <tt>cp-tree.h</tt>'s version of <tt>lang_decl</tt>.</p>
+enumerators with <code>CONST_DECL</code> nodes, so each enumerator takes
+128 bytes for the <code>tree_decl</code> node, plus additional memory
+for <code>cp-tree.h</code>'s version of <code>lang_decl</code>.</p>
<p>Each enumerator has an entry in the hash table, an identifier. Each
-identifier has a pointer to a binding of type <tt>cxx_binding</tt>
-(this is the bindings field in <tt>lang_identifier</tt>, defined in
-<tt>name_lookup.h</tt>). The binding for <tt>foo</tt> itself points to
-a <tt>tree_type</tt>, and the bindings for <tt>a</tt> and <tt>b</tt>
-point to <tt>CONST_DECL</tt> nodes. Each <tt>CONST_DECL</tt> node has
-pointers to the name and to the <tt>ENUMERAL_TYPE</tt> node, and
+identifier has a pointer to a binding of type <code>cxx_binding</code>
+(this is the bindings field in <code>lang_identifier</code>, defined in
+<code>name_lookup.h</code>). The binding for <code>foo</code> itself
+points to a <code>tree_type</code>, and the bindings for <code>a</code>
+and <code>b</code>
+point to <code>CONST_DECL</code> nodes. Each <code>CONST_DECL</code> node
+has pointers to the name and to the <code>ENUMERAL_TYPE</code> node, and
additionally has a pointer to a node representing the enumerator's
value. In simple examples like this one each enumerator's value is
-an <tt>INTEGER_CST</tt>, giving us another 36 bytes
-each. (An <tt>INTEGER_CST</tt> node contains a <tt>tree_common</tt>
+an <code>INTEGER_CST</code>, giving us another 36 bytes
+each. (An <code>INTEGER_CST</code> node contains a <code>tree_common</code>
subobject, with all the generality that implies.)</p>
<p>We don't need 200 bytes to represent the fact that the
-enumerator <tt>a</tt> has the value 0. First: as an stree it's
+enumerator <code>a</code> has the value 0. First: as an stree it's
unnecessary to store a pointer to the name of this enumerator. The
-stree will only be accessed via a <tt>cxx_binding</tt>, so any code
+stree will only be accessed via a <code>cxx_binding</code>, so any code
that accesses the stree already knows the name. Second: it isn't
-necessary to use anything so large as an <tt>INTEGER_CST</tt> to
+necessary to use anything so large as an <code>INTEGER_CST</code> to
represent the value "0". Most of the information stored in
-an <tt>INTEGER_CST</tt> (chain nodes, type pointers, etc.) is
+an <code>INTEGER_CST</code> (chain nodes, type pointers, etc.) is
unnecessary, since we already know we're getting to the value through
an enumerator. We only need to store two pieces of information: the
enumeration that this enumerator is associated with, and its initial
value. This allows us to represent the enumerator in six bytes: a
one-byte code for the type of the stree (specifically:
-the <tt>TREE_CODE</tt> of the tree that this stree corresponds to),
+the <code>TREE_CODE</code> of the tree that this stree corresponds to),
four bytes (a pointer or the equivalent) for the enumeration, and one
byte for the value. Note that this implies a variable-width encoding
for the integer values; some enumerations will require seven or more
@@ -132,7 +134,7 @@
scope might have values that depend on template parameters, meaning
that we can't necessarily represent the values as simple
integers. Neither is a serious problem. Because
-a <tt>cxx_binding</tt>'s value can be either a tree or stree, we can
+a <code>cxx_binding</code>'s value can be either a tree or stree, we can
use strees for the common, simple cases, and default to trees
otherwise. Because strees are a variable-sized representation, we can
add additional values needed for building trees for the complex case
@@ -141,7 +143,7 @@
<h2>Some implementation details</h2>
<p>The stree data structure itself is defined
-in <tt>stree.[ch]</tt>. Strees are tightly-packed, serialized
+in <code>stree.[ch]</code>. Strees are tightly-packed, serialized
representations of simple declarations</p>
@@ -161,72 +163,73 @@
<p>Clients
access stree data via an <i>iterator</i>: given an
-stree with index <tt>s</tt>, the function <tt>get_s_tree_iter</tt> (declared in
-<tt>stree.h</tt>) creates an iterator pointing to the beginning
-of <tt>s</tt>. Other functions declared in <tt>stree.h</tt> access the
+stree with index <code>s</code>, the function <code>get_s_tree_iter</code> (declared in
+<code>stree.h</code>) creates an iterator pointing to the beginning
+of <code>s</code>. Other functions declared in <code>stree.h</code> access the
iterator to extract each serialized value in turn. This scheme allows
us to store data in the most compressed representation possible, and
in a way such that clients are insulated from the details of the
representation. For enumerators, for example, instead of using a
-full <tt>INTEGER_CST</tt> for each value, we can use one or two bytes
+full <code>INTEGER_CST</code> for each value, we can use one or two bytes
in the (typical) case where the values are small.</p>
-<p>Strees are created with <tt>build_s_tree</tt>, a varargs function
-defined in <tt>stree.c</tt>. Its first argument is the stree code, and
+<p>Strees are created with <code>build_s_tree</code>, a varargs function
+defined in <code>stree.c</code>. Its first argument is the stree code, and
its remaining arguments are the contents of that stree and tags to
identify their types. There is no function for creating an stree by
treating it as a "stream" to which values are written one at a time;
eventually there probably will need to be one. It won't be hard to add
it. </p>
-<p>The files <tt>stree.h</tt> and <tt>stree.c</tt> are
+<p>The files <code>stree.h</code> and <code>stree.c</code> are
language-independent, since, at bottom, strees are just a way of
packing bytes and integers into chunks. Creation and expansion of
strees are language dependent. The present implementation is focused
on C++.</p>
-<p>We change <tt>cxx_binding::value</tt> from
-type <tt>tree</tt> to type <tt>s_tree_i_or_tree</tt> (a tagged union), and we change
-<tt>IDENTIFIER_VALUE</tt> so that it returns the tree value, expanding
+<p>We change <code>cxx_binding::value</code> from
+type <code>tree</code> to type <code>s_tree_i_or_tree</code> (a
+tagged union), and we change <code>IDENTIFIER_VALUE</code> so that it
+returns the tree value, expanding
the stree if necessary. A few changes are required in functions that
-manipulate <tt>cxx_binding</tt> directly, but those changes are
+manipulate <code>cxx_binding</code> directly, but those changes are
largely mechanical and are localized
-to <tt>cp/name_lookup.[ch]</tt>. </p>
+to <code>cp/name_lookup.[ch]</code>. </p>
-<p>Strees are expanded by the <tt>s_tree_to_tree</tt> function,
-defined in <tt>cp/decl.c</tt>. There are three points to notice about
+<p>Strees are expanded by the <code>s_tree_to_tree</code> function,
+defined in <code>cp/decl.c</code>. There are three points to notice about
it. First, as described above, it uses the stree iterator
interface. Second, the first byte of the stree is the stree
-code; <tt>s_tree_to_tree</tt> uses that code to determine what kind of
-tree to create. Third, at present <tt>s_tree_to_tree</tt> doesn't
+code; <code>s_tree_to_tree</code> uses that code to determine what kind
+of tree to create. Third, at present <code>s_tree_to_tree</code> doesn't
handle any cases other than enumerators.</p>
<p>The major changes required to use strees for enumerators are
-in <tt>build_enumerator</tt>. First, we need to separate parsing and
+in <code>build_enumerator</code>. First, we need to separate parsing and
error checking from tree generation, deferring the latter until
-later. Second, for simple cases we use <tt>build_s_tree</tt> to create
-the stree and <tt>push_s_decl</tt> to enter the stree into the current
-lexical scope. In principle <tt>push_s_decl</tt> would need to know
-all of the same logic that <tt>pushdecl</tt> does; in practice we only
-use <tt>push_s_decl</tt> for the simplest cases, deferring
-to <tt>pushdecl</tt> (i.e. using trees instead of strees) for the more
+later. Second, for simple cases we use <code>build_s_tree</code> to create
+the stree and <code>push_s_decl</code> to enter the stree into the current
+lexical scope. In principle <code>push_s_decl</code> would need to know
+all of the same logic that <code>pushdecl</code> does; in practice we only
+use <code>push_s_decl</code> for the simplest cases, deferring
+to <code>pushdecl</code> (i.e. using trees instead of strees) for the more
complicated cases.</p>
<p>This design has the virtue that most of the C++ front end doesn't
have to know about strees: code that goes through bindings to get trees
looks exactly as before. It has the defect that, as presently written,
it requires code duplication. The code required to generate an
-enumerator node is in both <tt>build_enumerator</tt>
-and <tt>s_tree_to_tree</tt>. Additionally, <tt>s_tree_to_tree</tt>
+enumerator node is in both <code>build_enumerator</code>
+and <code>s_tree_to_tree</code>. Additionally, <code>s_tree_to_tree</code>
is manageable only because at the moment it only handles a single
case. If this project succeeds, and we're handling many kinds of
strees, it would become a monstrosity. The right solution will
-probably be to replace <tt>s_tree_to_tree</tt> with a wrapper function
+probably be to replace <code>s_tree_to_tree</code> with a wrapper function
that examines the stree code and dispatches to a function for the
appropriate code, and, for each code, to write an implementation
function that's shared between the tree and stree versions. Similarly,
-we can probably achieve better code sharing between <tt>pushdecl</tt>
-and <tt>push_s_decl</tt>.</p>
+we can probably achieve better code sharing between <code>pushdecl</code>
+and <code>push_s_decl</code>.</p>
<h2>Debugging information</h2>
@@ -281,7 +284,7 @@
<p>More importantly, some gcc versions already remove unneeded
declarations from debug information. GCC 3.4 does not generate DWARF
debug info for function declarations, and does not generate debug info
-for unused types unless <tt>-fno-eliminate-unused-debug-types</tt> is
+for unused types unless <code>-fno-eliminate-unused-debug-types</code> is
specified. Apple's gcc has stripped "unused" symbols out of STABS
debugging format for the last two and a half years. The debugger team
expected many bugs from users trying to examine unused declarations,
@@ -303,7 +306,7 @@
compiles.</li>
<li>Refactor the stree creation and expansion code
-in <tt>cp/decl.c</tt>to avoid code duplication. Or, to put it
+in <code>cp/decl.c</code>to avoid code duplication. Or, to put it
differently: develop a general framework for separating the early
phase (parsing and error checking) from the later stages of tree
generation and transformation.</li>
@@ -311,9 +314,9 @@
<li>Separate out the checks
required for user declarations from those for compiler-generated declarations.</li>
-<li>Get rid of the <tt>tree_or_stree</tt> struct. Instead just use
-an ordinary union, and steal one of the unused bits in <tt>cxx_binding</tt>
-to identify which member we want.</li>
+<li>Get rid of the <code>tree_or_stree</code> struct. Instead just
+use an ordinary union, and steal one of the unused bits in
+<code>cxx_binding</code> to identify which member we want.</li>
<li>Eliminate remaining global scans of decls, to make sure that we
don't expand strees unnecessarily.</li>
@@ -322,7 +325,7 @@
<h2>Contributing</h2>
-Check out <tt>stree-branch</tt> from the CVS repository. This branch
+Check out <code>stree-branch</code> from the CVS repository. This branch
is being maintained by Matt Austern, Robert Bowdidge, Geoff Keating
and Mike Stump. Patches should be marked with the
tag <code>[stree]</code> in the subject line.
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2018-09-02 13:21 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-02 13:21 [wwwdocs] projects/strees/index.html -- convert to HTML 5 Gerald Pfeifer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).