* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 15:36 Richard Kenner
  2003-05-09 15:56 ` Daniel Jacobowitz
  2003-05-09 15:57 ` Michael Matz
  0 siblings, 2 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 15:36 UTC (permalink / raw)
  To: matz; +Cc: gcc

    It seems to be a confusing name, as it doesn't work locally at all.  

Then in what sense is it "local"?  Do you know?

      p1 <-- 100
      p2 <-- p1 + p3
    into
      p1 <-- 100
      p2 <-- 100 + p3

For many machines, the former is better.  CSE has lots of logic to do this
and knows which is better.  What is this code doing, why does it do things
differently than CSE, and how does it interact with CSE?  These are questions
that need to be answered and documented in the code.
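
To make the point concrete, here is a minimal C-level sketch (purely
illustrative; the constant is simply chosen to be too wide for a typical
add-immediate field):

        int
        f (int p3)
        {
          int p1 = 0x12345678;  /* too wide for many RISC add-immediate fields */
          int p2 = p1 + p3;     /* keeping p1 here reuses the register that
                                   already holds the constant; propagating
                                   0x12345678 into the add would force it to
                                   be re-materialized */
          return p1 ^ p2;
        }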

I'd expect gcse to only operate on the *start* of a basic block and let
CSE make the decisions *within* the block.  But the code doesn't seem to
do that.  Without the missing documentation, I can't tell if that's a bug
in the code or is the way it was designed to work.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-12 16:25 Richard Kenner
  0 siblings, 0 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-12 16:25 UTC (permalink / raw)
  To: matz; +Cc: gcc

    > For this and other reasons I've come to the conclusion that we need to
    > separate our optimizations into generic and target dependent
    > optimizations. And that conceptually the generic optimizations should
    > strive to eliminate as much redundancy as possible, fully propagate
    > constants/copies, etc and leave it to the expanders and target
    > dependent optimizers to "undo" unprofitable redundancy elimination,
    > cprop, etc.

    Amen.

And I continue to strongly disagree with the above.  The key to GCC's
success over the years has precisely been the knowledge of the
underlying machine characteristics within the optimizers.  I certainly
agree that there are some high-level optimizations that can be
performed in a machine-independent manner, and these are the ones that
are being done on trees.

But I think that once we're at the stage of optimizing RTL, all optimizers
need to know a lot about the machine they are optimizing for.  Moreover,
I believe that the RTL optimizers will always have to duplicate some
optimizations done at the tree level: for example, address calculations can
produce common subexpressions (often global) and loop invariants.
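
A trivial C example of the kind of thing I mean (illustrative only; it
assumes the usual layout with 4-byte ints):

        struct pt { int x, y; };

        int
        sum (struct pt *a, int i)
        {
          /* Neither field access looks redundant at the tree level, but once
             the addresses are made explicit in RTL, both go through the same
             a + i*8 computation -- a common subexpression that appears only
             after expansion.  */
          return a[i].x + a[i].y;
        }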

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-12 11:37 Robert Dewar
  0 siblings, 0 replies; 56+ messages in thread
From: Robert Dewar @ 2003-05-12 11:37 UTC (permalink / raw)
  To: kenner, matz; +Cc: gcc

> Well, at the top of gcse.c it says that one of the passes in GCSE is
> copy/constant propagation.  This is a standard optimization, and hence an
> actual description of the algorithm (at the high level) seems superfluous.
> Tricks used in the implementation should of course be documented, but have
> no place in any high-level overview comments.

Copy/constant propagation is really quite vague: it can refer to various
algorithms and intended optimizations.  Often people have taken a compiler
course, or read some particular book or set of papers, and they assume that
the terminology is standard.  That's far from true in the compiler field in
general, and in optimization in particular, so it is quite important to
define terms like this clearly.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-12 11:36 Robert Dewar
  0 siblings, 0 replies; 56+ messages in thread
From: Robert Dewar @ 2003-05-12 11:36 UTC (permalink / raw)
  To: gcc, kenner

I definitely agree with Richard that it is very important for all interfaces
to be completely documented at the time the code is put in (indeed, good
practice is to write the documentation at the time the code is written!).

At a minimum, any new functions should be fully documented, including
the exact semantics of each argument.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-12 11:16 Richard Kenner
  0 siblings, 0 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-12 11:16 UTC (permalink / raw)
  To: Richard.Earnshaw; +Cc: gcc

    Which, in RTL, is

      (set b (plus (mult d 4) b))
      (set c (plus (mult d 4) c))

    into 

      (set t1 (mult d 4))
      (set b (plus t1 b))
      (set c (plus t1 c))

    is a de-optimization (takes 3 cycles instead of 2).

It's long been on my list to try to see if I can find some way to turn the
latter back into the former.  The problem is that combine won't do it
because it doesn't know it can kill T1.  But I've never come up with
any way that does it.  I think this sort of thing is very common,
especially with addressing modes on CISC machines.
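
For reference, the source-level pattern being discussed is roughly the
following (a sketch only; x86 lea is just one example of a scale-and-add
form):

        long
        f (long b, long c, long d)
        {
          /* On a CISC target each statement can be a single scale-and-add
             instruction (e.g. "lea b, [b + d*4]" on x86), so factoring d*4
             out into a temporary turns two instructions into three.  */
          b = d * 4 + b;
          c = d * 4 + c;
          return b ^ c;
        }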

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-11 19:58 Richard Kenner
  2003-05-13  5:59 ` Michael S. Zick
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-11 19:58 UTC (permalink / raw)
  To: kaih; +Cc: gcc

    As there is so often, there is an answer that is perfectly obvious, but  
    not helpful in practice.

    Have a cost function that determines how good the optimization of a  
    program is.

    Run all optimizations in some predetermined sequence.

    Compare the cost function of the output to the one of the input.

    As long as there was some improvement, do it again.

    Then use the best result.

    Compile times? What do you mean, compile times?

Even aside from compile times, since some optimizations are performed before
register allocation and some after, no such iteration is possible.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-10 18:08 Richard Kenner
  0 siblings, 0 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-10 18:08 UTC (permalink / raw)
  To: rakdver; +Cc: gcc

    good luck.  If you want them to be really good, they must all work
    globally.  They must also take into account what any other later
    optimization could do.  This is not feasible.

It isn't feasible to do it perfectly, of course not, but if we
properly specify what later optimizations do, it's not that unreasonable
to get something that makes the right decision the vast majority of the time.

    Why should it not?  You must gather the data on which you base your
    decision, which definitely is not fast if you want to do it everywhere.

Usually that data is fairly cheap to get, such as a cost function.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-10 17:04 Richard Kenner
  2003-05-10 17:18 ` Zdenek Dvorak
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-10 17:04 UTC (permalink / raw)
  To: rakdver; +Cc: gcc

    I guess that for every such heuristic you will, with some care, find an
    example where it does something stupid.  In general it makes more sense to
    have a few passes with really good heuristics than to have some lame
    heuristic in every pass.

No, all should be good!  The strength of GCC is that it does optimization
everywhere.  We should not sacrifice that. 

    As for the speed -- I guess this is just a matter of opinion, as there
    is nothing to compare.  My opinion is that the increased speed of most
    of the passes would make up for the decreased speed of a few cleanup
    passes (they must be present in some form anyway, and they already must
    gather all the data, so the slowdown should not be that dramatic).

Increased over what?  Why does a heuristic slow things down?

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-10 16:49 Richard Kenner
  2003-05-11 19:42 ` Kai Henningsen
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-10 16:49 UTC (permalink / raw)
  To: rakdver; +Cc: gcc

    just from practice -- looking at where this "pragmatic approach" has
    brought gcc.  Yes, we have a mighty and cool "CSE" pass; but after 2 years
    working on gcc, I would not dare to change a single line there because I
    don't know what will happen.

That's just lack of familiarity.  I fully understand cse.c but am afraid to
change any line of gcse.c!

    no -- there are optimizations that are obviously machine-dependent
    (like scheduling, register allocation, peephole, the combiner and a whole
    lot of others).  Then there are optimizations that are based on some
    general, machine-non-specific idea (like the idea that computing one
    expression several times is wrong, or that replacing a variable with its
    constant value is good -- CSE, GCSE, copy/const propagation, part of the
    loop optimizations and some others fall into this category).

I don't understand this distinction at all.  The combiner, for example, acts
by always decreasing the number of insns.  That's a very machine-independent
criterion and optimization.  All of the folding done by it and by CSE is
machine-independent.

Replacing a variable with its constant value may or may not be an
"optimization", but whether it is or not is a machine-dependent choice, so I
don't see why you put it in the machine-independent category.

CSE can be viewed as very machine-dependent because the choice of which
expression is best for a given insn is dependent on lots of machine-specific
parameters.

And even "computing one expression several times is wrong" may not be correct
for simple enough expressions when register pressure is an issue.
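
A contrived C sketch of the kind of case I have in mind (illustrative only;
imagine a register-poor target with all of a..g live across the block):

        int
        f (int x, int y, int a, int b, int c, int d, int e, int g)
        {
          int t1 = (x + y) * a;
          /* Commoning the two x + y computations into one temporary keeps
             that value live across the block and can force a spill;
             recomputing the addition at the second use is one cheap insn
             and no extra register pressure.  */
          int t2 = (x + y) * b;
          return t1 + t2 + c + d + e + g;
        }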

So I really don't see *any* optimization that, done properly, is
machine-independent.  The key to GCC has been to use machine-specific
information to guide all optimizations.

    The sane way to write a compiler is to run these generic passes
    first, then run machine-dependent passes that may eventually fix the
    mistakes the previous optimizations have made (due to their generic
    ideas not being right in some corner cases).  This way we would get
    approximately the same results, but in a much more transparent way.

No, I disagree.  Some machine-dependent optimization might expose more
common expressions or loop invariants.

The issue of what order to run optimizations in is very tricky, and I don't
think it can be answered in that sort of general way.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-10 15:29 Richard Kenner
  2003-05-10 15:52 ` Zdenek Dvorak
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-10 15:29 UTC (permalink / raw)
  To: rakdver; +Cc: gcc

    > Perhaps, but a lot of that complexity is *needed*.  The reason that GCC
    > has historically beaten other compilers in code quality is paying
    > attention to important heuristics and less on "textbook" material.
    > Sure the latter is also critical, but the pragmatics cannot be ignored.

    the problem with this approach is that you will eventually end up with
    an incomprehensible piece of code where everything depends on everything
    else and you cannot make the simplest change without everything else
    falling apart.

Obviously, that's not desirable, but I don't see why it necessarily follows
from taking a pragmatic approach to optimizations.

    GCSE is basically machine-independent, so there should not be a need for
    it to take machine-dependent issues into account.  These dependencies
    should be isolated in a few later passes that can take care of the
    issues you mention without harming each other.

Here I don't follow you.  *All* of the optimizers are written in a style
of being machine-independent but take into account machine-dependent
parameterizations.  Are you claiming that somehow this should not be true
for gcse?

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-10  2:21 Richard Kenner
  2003-05-10 15:12 ` Zdenek Dvorak
  2003-05-12 17:39 ` law
  0 siblings, 2 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-10  2:21 UTC (permalink / raw)
  To: jh; +Cc: gcc

    Then you will end up with never-ending dataflow problems in the global
    pass.  GCSE in its simplest definition is already too complicated
    for us to implement correctly (we have been at it since '97 and it is
    still inferior to what is done by other compilers).  Adding more
    complexity, like CSE has, would make it even crazier.

Perhaps, but a lot of that complexity is *needed*.  The reason that GCC
has historically beaten other compilers in code quality is that it pays more
attention to important heuristics and less to "textbook" material.  Sure, the
latter is also critical, but the pragmatics cannot be ignored.

    CSE does a lot and is a powerful pass that does approximately what is
    accomplished by multiple passes as described in the literature.  The
    problem is that we don't know how to make it global and/or faster and
    we know that it is too slow right now.

A lot of the slowness is due to the "global" parts of its activities.  I've
suggested a number of times that they be turned off since they would seem
redundant with GCSE.  But when this is tried, it lowers performance, which
certainly suggests that the "complexity" of CSE is worth something.

    I am still in the dark.  Perhaps it is time to go to sleep first and then
    continue arguing.  Could you give me some example of the code you
    would like to see transformed?

It's code that I *don't* want to see transformed.  For example, if I call
a group of functions and pass each a constant of 100 on a RISC machine where
the first operand is passed in RX, I expect to see a series of insns, each
setting RX to 100.  I don't want to see one insn setting a pseudo to 100
and then insns setting RX to that.
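
In source terms, I mean something like this (the function names are of
course made up):

        extern void f (int), g (int), h (int);

        void
        calls (void)
        {
          /* Each call should expand to an insn setting the argument register
             directly to 100, not to one insn setting a pseudo to 100 followed
             by three register-register copies.  */
          f (100);
          g (100);
          h (100);
        }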

    >     I mean an extended basic block like:
    > 
    >       A
    >      / \
    >     B   C
    > 
    > That's not what cse.c means by an extended basic block.

    CSE will actually follow the jumps, so it should track it.

Yes, but that's not the case I'm talking about: I'm talking about the
extended basic block case.

It is just a pure if-then-else (I didn't draw the join edge).

    How?  Assuming that B has some other instructions, I will end up with
    both the copy and the branch.  Or am I missing something else?

I think we're talking about

	if (condition)
	   A
	else
	   B

This has two jumps.

If I can make A into a NOP, this becomes

	if (!condition)
           B

which has one jump.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 23:13 Richard Kenner
  2003-05-09 23:23 ` Jan Hubicka
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 23:13 UTC (permalink / raw)
  To: jh; +Cc: gcc

    They are tied in the allocator; the point is to tie them earlier, so the
    expression
    (plus (reg a)  (reg c))
    will look equivalent to
    (plus (reg a)  (reg a))

You don't need to do a replacement: you just record in a table, like
cse.c does, what all the equivalences are.

I think the confusion here is in trying to do two distinct things
simultaneously: one is to track values and the other is to actually make
changes in insns.

We all agree that it is important to have a good table of all the
equivalent values.  But the question of what substitutions in the insn
stream should be done is a completely independent question.
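
To be concrete about that separation, here is a toy value table in C (purely
illustrative; it has nothing to do with cse.c's real data structures):

        #define NREGS 64

        static int value_of[NREGS];     /* value number held by each reg;
                                           0 means "unknown" */
        static int next_value = 1;

        /* Record "dst <- src" in the table without touching any insn.  */
        static void
        note_copy (int dst, int src)
        {
          if (value_of[src] == 0)
            value_of[src] = next_value++;
          value_of[dst] = value_of[src];
        }

        /* Record that DST was set to something we cannot track.  */
        static void
        note_set (int dst)
        {
          value_of[dst] = next_value++;
        }

        /* Do two regs currently hold the same value?  A substitution pass can
           consult this; whether to actually rewrite any insn is its own,
           separate decision.  */
        static int
        equivalent_p (int r1, int r2)
        {
          return value_of[r1] != 0 && value_of[r1] == value_of[r2];
        }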

    You can construct cases where copy propagation increases the live ranges
    of registers (when the earliest incarnation of the value is killed, so
    the later incarnations are needed).  I seem to remember that even this
    subset of the optimization (looking for the minimal number of copies with
    the minimal number of simultaneously live registers) is NP-complete.

Right.  That's why heuristics are important.

    We do have heuristics to deal with this, but they don't work well either,
    and I believe that they work worse than the copy propagation mechanism
    (at least that is how I explain the speedups I measured after
    introducing the special-purpose copy propagation pass).

Because you are using that pass not only to modify insns, but also to track
values with gcse.  If you separate those two functions, you may find that
it's better still.

    In the case where we can use it each time, the constant propagation as
    implemented will do that (replace every occurrence of the pseudo by the
    constant).  I thought you were speaking about the situation where it is
    not a good idea to replace the registers by the constant?

That's in the case of a reg-reg assignment, not a use of a constant
within an expression.

    I mean an extended basic block like:

      A
     / \
    B   C

That's not what cse.c means by an extended basic block.

    Now CSE is deciding whether the NOP will be in A or B.  A is put before B
    (this is usual) and B contains some other instructions.  Now it will
    place the NOP in B and keep the copy in A, and it can be the case that A
    is executed 1000000 times while B is not executed at all.

You're forgetting that the case in question had *two* branches.  If the
inner one isn't executed, you're replacing a branch with a copy, which is
much faster on modern machines.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 22:51 Richard Kenner
  2003-05-09 22:59 ` Jan Hubicka
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 22:51 UTC (permalink / raw)
  To: jh; +Cc: gcc

    For each use of the register (reg c) you look for the chain of
    operations preceding it, like:
    (set (reg b) (reg a))
    (set (reg c) (reg b))
    and you replace the use by the oldest copy of the value, (reg a) in this
    case.

Why?  That seems, at best, to accomplish nothing and, at worst, to
pessimize the code.  I'm not familiar with the new register allocator,
but in the original one, all of those registers, assuming they are local,
will be tied together and all allocated the same hard register.

It would seem to me that when you do this optimization, you run the
real risk of increasing the lifetime of a register over what it was
before, especially if you consider the cases where some of the
registers in question are hard regs and some are global.  Don't we already
have code that knows how to handle these cases efficiently?
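
For concreteness, the rule quoted above amounts to something like this toy
sketch (bare pseudo-register numbers, no relation to the real RTL
structures); the comment is exactly the live-range worry:

        #define NREGS 64

        /* copied_from[r] is the register r was copied from, or r itself if r
           was not defined by a copy.  */
        static int copied_from[NREGS];

        static void
        init_copies (void)
        {
          int r;
          for (r = 0; r < NREGS; r++)
            copied_from[r] = r;
        }

        /* The "oldest copy" that the quoted scheme substitutes for a use.
           If the chain is a -> b -> c and the final use of c is rewritten to
           a, then a must stay live all the way down to that use, even if it
           would otherwise have died right after the first copy.  */
        static int
        oldest_copy (int r)
        {
          while (copied_from[r] != r)
            r = copied_from[r];
          return r;
        }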

    I see, you are looking for something like (set (reg a) (reg b)) where
    reg b is known to be 100.  Constant propagation will turn this into
    (set (reg a) (reg 100)), but it is possible that copy propagation would
    eliminate the need for reg a entirely, replacing all the uses by reg b
    (or this can be done in register allocation).

I'm talking about register allocation, not copy propagation.

    In the compiler books this does not matter; 

A *lot* of things "don't matter" in compiler books: that's why they are
*books*.  ;-)

    It is not perfect, as in the case where the user already wrote code
    initializing a variable to 100 many times, we won't cache the copy of 100
    in a register.

But if the constant of 100 can fit into one insn, why would you *want to*
cache it in a register?  It's more efficient to use it each time rather
than waste a register.

    This is also interesting.  How well does the heuristic work?  In fact
    I would expect that it will often fail to eliminate the else block
    and result in moving the NOP to the less frequently executed place in the
    superblock.

I'm not sure what you mean here.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 22:30 Richard Kenner
  2003-05-09 22:51 ` Jan Hubicka
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 22:30 UTC (permalink / raw)
  To: jh; +Cc: gcc

    In all three books on compilers that I have, the algorithm is documented
    as maintaining a hash of available expressions; when there is a hit, it
    inserts a copy instruction on one side and a use on the other.  This is
    what GCSE does.

I'd call that a rather over-specified definition.  I wouldn't say that a
key attribute of a CSE algorithm would be the insertion of those copies.
Indeed, cse.c doesn't *want* to do that since it would increase the number
of insns considerably.  It's not at all clear to me that the number of
times such copies would win over the more straightforward approach that 
cse.c does justifies the overhead in all the new insns and pseudos.

I agree that it makes sense for a global cse to do such a thing, however,
because then cse.c can clean it up.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 22:26 Richard Kenner
  2003-05-09 22:44 ` Jan Hubicka
                   ` (2 more replies)
  0 siblings, 3 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 22:26 UTC (permalink / raw)
  To: jh; +Cc: gcc

    This would not work for the copy propagation (as we need to remove the
    reference to the register in order to get dead code removed).

Can you define "copy propagation"?  I'm not familiar with the term.

    Can you do some benchmarking on ROMP?  

No, it's a dead machine.  I was just giving that as an example of the sort
of problems you can get into by blindly replacing registers by constants.

It can be a tricky issue.  On the one hand, if you have a reg-reg SET with
at least one register being a pseudo and you replace the source with a
constant, you may have replaced a NOP with an insn that does something.  On
the other hand, doing that replacement may have eliminated the last use of
another pseudo.

    I was worried about that when I was adding the local cprop pass (which
    doesn't introduce the problem, just makes it worse), and I did an
    evaluation on SPARC (maybe MIPS, I don't remember exactly); the code
    produced was faster and shorter on average, so I concluded that this
    should be mostly safe.  When the constant is expensive, it usually means
    that it always needs to be loaded into a register, and then the pattern
    should refuse the immediate operand.

Clearly, but the issue isn't just when you need more than one instruction
to load the constant.  See above.  You run a risk of turning an insn that
will become a NOP into a real insn.

In some sense, the question is where it is better to have a NOP.  But in
the case I mentioned, where the NOP is in a conditional context, it's best
to have it there.

In other words, if we have

	if (condition)
	   a = exp1;
        else
	   a = exp2;

and we have a choice of putting the NOP before the "if" or making the
assignment from exp1 be a NOP, the latter is preferred since it eliminates
a jump.  Thus the heuristic is to have the NOP as *late* as possible.  But
I think the gcse.c code moves it as *early* as possible.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 22:17 Richard Kenner
  2003-05-09 22:24 ` Jan Hubicka
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 22:17 UTC (permalink / raw)
  To: jh; +Cc: gcc

    No, the CSE algorithm as usually defined in the books would simplify,
    for instance:
    (set (reg a) (plus b c))
    (use a)
    (set (reg a) (something else))
    (set (reg b) (plus b c))

    But our algorithm won't, while it does other things that the CSE described
    in the books won't, so it is not CSE as the rest of the world knows it,
    and naming it CSE is misleading.

Well, I'm not sure what the "usual" cse algorithm would simplify this into,
since the result of the addition is no longer around, but defining an
algorithm by what it would do in an obscure case seems odd to me.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 21:36 Richard Kenner
  2003-05-09 22:19 ` Jan Hubicka
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 21:36 UTC (permalink / raw)
  To: rth; +Cc: gcc

    Because cse.c is too slow.  It cannot be used for code cleanup
    within another pass.  Compilation time being high on everyone's
    minds, I think this is an exceedingly good reason to have two
    passes.

Fair enough, but it would seem that all it has to do is to add a
REG_EQUAL note to accomplish its inter-pass purposes and then let CSE
decide whether the zero or a register is best.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 16:39 Benjamin Kosnik
  2003-05-09 18:22 ` Phil Edwards
  0 siblings, 1 reply; 56+ messages in thread
From: Benjamin Kosnik @ 2003-05-09 16:39 UTC (permalink / raw)
  To: gcc


For showing interrelationships between functions, you might find the
doxygen maps useful. 

http://www.jaj.com/space/phil/gccdoxy/gccdoxy_20021228.tar.bz2

-benjamin

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 16:18 Richard Kenner
  2003-05-09 22:06 ` Jan Hubicka
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 16:18 UTC (permalink / raw)
  To: matz; +Cc: gcc

    Well, when dead code is removed this will look like:

      p2 <-- 100 + p3

Only if p1 is, in fact, dead.

    But this needs to be determined at a later point, instead of cluttering
    all the optimization passes with machine-dependent checks.

You lost me here.  All of RTL and every choice within it is implicitly
machine-dependent.  Every cost calculation is machine-dependent.  The
entire RTL optimization strategy is based on making machine-dependent
decisions (of course, written in machine-independent ways).

    CSE as we implement it is fundamentally different from gcse.  For one
    thing, our CSE is not really CSE at all but a kind of ad hoc value
    numbering intertwined with expression simplification.

Again, you lost me.  It eliminates common subexpressions.  Isn't that what
"cse" stands for?

    And I can't find a function in cse.c which does something similar
    (i.e. just propagating constants and copies).  

Well, fold_rtx, I suppose, but that's because it does everything at
one time.  And, as I've pointed out, it's not always better to
substitute a constant for a register: sometimes, it's best to do it
the other way around.  I have a favorite example for this: the (now-obsolete)
ROMP chip had an ABI where the return register and the first argument
register were the same.  It also had a conditional return insn.  So if you
have a function with one operand, say X, and you have:

	if (x == 0)
	  return 0;

What you really want to do with the insn setting the return value is
to change *from* the 0 *to* the equivalent, X.  That way that insn is a
no-op, and you can do this as a conditional return insn.

CSE gets this right because it uses the cost metric.  It sounds like the
code in gcse.c will get this wrong.

    Like I said, local_cprop* has nothing to do with gcse itself, except being
    run to clean up the code between the gcse iterations.

Then that's all the *more* reason why it should be documented!

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 16:08 Richard Kenner
  2003-05-09 21:04 ` Richard Henderson
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 16:08 UTC (permalink / raw)
  To: drow; +Cc: gcc

    We don't live in a void of one flat source tree.  While I'm not arguing
    that documentation would be better, it's still easy to find the history
    of this code.  It was added last July by Jan, in this patch:
      http://gcc.gnu.org/ml/gcc-patches/2002-07/msg00617.html

Indeed now that I see that message, I remember it.  But it doesn't answer the
question I have: why is it appropriate to have *two* constant propagation
routines within a basic block where only one (the one in cse.c) knows how to
properly apply cost functions?  At the time I saw this message, I was dubious
about this and thought I even raised the issue on the lists.

Moreover, this goes counter to the claim that was made just a few hours ago,
which is that the "local" did *not* refer to doing something that was
non-global, thus showing that there is indeed major confusion due to lack of
documentation.

However, the patch as cited in the message should *not* have been approved,
both because of the missing documentation, and because of the confusion about
the role of CSE in this task, as exemplified by the last sentence of the
posting: "I am surprised our local CSE don't know how to handle it, but I
will check it separately."  Since that "check" should have resulted in
not needing this patch, it needed to be done before the patch was approved.

My point in raising this was not to specifically get the information on
the code (as you point out, it's not hard to find the message where a 
particular piece of code was first proposed), but to address the more general
issue that there is *lots* of code that is improperly documented, and it's
unreasonable to do this sort of search for *every* such function.

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 14:10 Richard Kenner
  2003-05-09 14:37 ` Michael Matz
  0 siblings, 1 reply; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 14:10 UTC (permalink / raw)
  To: matz; +Cc: gcc

    Also before the *local_cprop* functions, except local_cprop_pass(),
    which is only called at toplevel from one_cprop_pass().  

Also, what does "local" mean in this context?  I'd take it as the opposite
of "global", meaning to me that this does an optimization on a small part of
the function being compiled.  Which part?  I see no documentation of that.

Or is this just a confusing name?

* Re: An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 14:01 Richard Kenner
  0 siblings, 0 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 14:01 UTC (permalink / raw)
  To: matz; +Cc: gcc

    I don't know about you, but when I look at gcse.c, I see heaps of
    comments before most of the functions.

The problem is that it's a requirement that it be before *all* functions,
not "most".  I count nine that are missing one, and I didn't look at how many
are incomplete (see one example below).

    Also before the *local_cprop* functions, except local_cprop_pass(), which
    is only called at toplevel from one_cprop_pass().

But if I'm looking to see what a specific local_cprop function does,
I'm going to look in front of local_cprop_pass, not try to find the function
that calls it and look there.

Now let's consider the documentation of one_cprop_pass.  It refers to a
"copy/constant propagation pass".  I did a search of the entire file for
the string "copy/constant propagation" and found *no* description of what
that term means.  Yes, most people would know what "constant propagation"
is, but what does "copy" mean in that context?

    Also, the whole of gcse.c is documented at the top of that file rather
    extensively.

No, I don't consider it "extensive" at all.  Yes, it gives a very broad
overview of what's being done, but doesn't link that overview to the
implementation.

    And additionally a heap of paper references are given
    for people really interested in the theoretical background.

That's good, but I suspect that the comments also assume that every
reader has read all the papers.  If the intent is to rely on a paper
as a prerequisite for the documentation, there should be a pointer to *one*
paper and a URL where it can be found.

    Well, at the top of gcse.c it says that one of the passes in GCSE is
    copy/constant propagation.  This is a standard optimization, and hence
    an actual description of the algorithm (at the high level) seems
    superfluous.

I disagree.  There are lots of "standard" optimizations, but once you
go from the realm of a paper into an actual implementation, you need to
say *precisely* what you've done.

    Tricks used in the implementation should of course be
    documented, but have no place in any high-level overview comments.

Agreed, but what *is* important in that high-level overview, and is missing,
is an explanation of how the optimizations in this file interact with other
similar optimizations.  Specifically, it should say where the boundary lies
between what's done here and what's done in CSE.

I also see absolutely no discussion of the critical issue of when constants
should replace registers: CSE has a lot of code to do that, and if it's being
done differently in gcse there needs to be documentation saying why.

    About the actual problem you have: there is code to adjust the
    REG_EQUAL notes for changed libcall blocks, namely in
    adjust_libcall_notes() called from do_local_cprop(), which sounds as
    if it should do what you wanted.  Ergo you probably want to look at
    that function to see why it doesn't work.  The third match for "LIBCALL"
    set me on the right track, so it wasn't exactly difficult to find.

OK, but just as an example of what's missing, I'm looking at

/* LIBCALL_SP is a zero-terminated array of insns at the end of a libcall;
   their REG_EQUAL notes need updating.  */

static bool
do_local_cprop (x, insn, alter_jumps, libcall_sp)

It's nice that *one* operand to that function is documented, but it would
be even better if *all* of them were and better still if there were at least
one sentence saying what the function as a whole is supposed to do.

Getting back to adjust_libcall_notes: there are absolutely no comments
within the routine.  There's a call to reg_set_between_p whose purpose
isn't obvious to me, and nothing in the function header comments says why
register usage would be important.  Since there are no comments in the
function itself, there's no way to know what was intended there.  Perhaps
that's related to the bug, perhaps not, but without knowing the *purpose*
of that call, it's hard to say.

* An issue for the SC: horrible documentation quality of GCC
@ 2003-05-09 12:12 Richard Kenner
  2003-05-09 12:41 ` Michael Matz
                   ` (3 more replies)
  0 siblings, 4 replies; 56+ messages in thread
From: Richard Kenner @ 2003-05-09 12:12 UTC (permalink / raw)
  To: gcc

I want to repeat what I just said in a separate message: the amount of
totally undocumented code in GCC is approaching very troublesome levels.
There are whole areas of the compiler in which nobody but their author can
work, due to a complete lack of any high-level documentation.

The local_cprop_* functions in gcse are a very good case in point.

There is no comment anywhere that I can find of the form "This phase performs
the ABC optimization by doing the following algorithm.  This optimization
is useful because XYZ and does not duplicate a similar optimization in QQQ
because of BAR."

From the name, I can conclude this has *something* to do with "local
constant propagation".  If I studied the code long enough, I could probably
figure out what it actually *does*.

But there is no way to figure out the really important part of the
missing documentation, which is what the phase is *supposed* to do and
the motivations for the new phase.

This is by no means the first time this has happened: in most cases when
I've had to look at a relatively recent part of the compiler, I find this
lack of documentation, and I have complained about it before.

There's no point in trying to understand how we got to this appalling state
of affairs, but we do need to decide how to proceed from here.  My suggestion
is the following:

(1) Do not allow any further patches to the compiler that do not have proper
documentation.

(2) Remove any parts of the compiler for which documentation is not supplied
by the 3.4 release.

It is very clear that the approach of supplying documentation later
does not work.
