Re: Reassociation

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: Reassociation
@ 1998-12-16 19:37 tprince
  1998-12-16 20:08 ` Reassociation Jeffrey A Law
  0 siblings, 1 reply; 13+ messages in thread
From: tprince @ 1998-12-16 19:37 UTC (permalink / raw)
  To: bosch, burley, egcs, hjstein, jbuck, law, moshier

Fortunately, we're not facing the dilemma of needing something
for g77 which is not wanted otherwise.  What Jeff has proposed
is in accordance with the excerpts from C9X which I quoted
earlier, as well as being in accordance with Fortran.  I doubt that
the C9X people really intended to help out egcs/g77, but they
may have done so in this case.

I suppose the next comment we'll see is "they're trying to make C
look more like Fortran."  But, I think there is a sufficient body of
experts who have taken a satisfactory position on what the
proper level of reassociation should be for normal optimization.

 gcc and g77 could become examples of this, and provide an
alternative to "all-or-nothing" compilers which try to give a choice
of going too far or not far enough, with a poorly documented
group of options unlike those used for benchmarks needed to
approximate correct behavior.  No matter if another flag or two
come into egcs, as long as we don't need to use them to get
reasonable behavior.

 >>>
Date: Wed, 16 Dec 1998 12:37:01 -0500 (EST)
From: Craig Burley <>

>There has been general agreement that gcc is not supposed to
make
>optimizations that could change the value of an expression.
>Associative law optimizations of floating-point expressions
certainly
>can change the value and they should continue to be
disallowed.  Users
>should not have to fear that gcc is going to take arbitrary
liberties
>and rewrite their programs for them without very explicit
permission.

Even if this was true -- and IMO it isn't -- it doesn't apply to well-
designed numerical-processing languages like Fortran, in which
it
is quite clear that A*B*C*D can be evaluated as (D*A)*(C*B), for
example.

Therefore, we should make the *back end* work properly in a
language-
independent sense.  If people want the gcc to produce slower
code
than g77, that's okay -- they can have gcc explicitly group the
operations for consumption by the back end, something Fortran
doesn't
generally need to do.

        tq vm, (burley)

Dr. Timothy C. Prince
Consulting Engineer
Solar Turbines, a Caterpillar Company
alternate e-mail: tprince@computer.org

           To:                                              INTERNET - IBMMAIL
                                                            N3356140 - IBMMAIL

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-16 19:37 Reassociation tprince
@ 1998-12-16 20:08 ` Jeffrey A Law
  1998-12-17  9:24   ` Reassociation Gavin Romig-Koch
  0 siblings, 1 reply; 13+ messages in thread
From: Jeffrey A Law @ 1998-12-16 20:08 UTC (permalink / raw)
  To: tprince; +Cc: bosch, burley, egcs, hjstein, jbuck, moshier

  In message < 4.19981216.22.37.12.633453@cat.e-mail.com >you write:
  > Therefore, we should make the *back end* work properly in a language-
  > independent sense.  If people want the gcc to produce slower code than
  > g77, that's okay -- they can have gcc explicitly group the operations for
  > consumption by the back end, something Fortran doesn't generally need to
  > do.
GCC's backend is supposed to be language independent.  Before implementing
any of the optimizations I mentioned we've have to have a reasonable way to
describe in the tree structures what reassociations are permitted and which
are not.

The obvious way to do that in this case is:

	* Introduce the concept of parens into the tree nodes for
	  expressions.

	* Honor parens and perform no associations across them.

	* If a language does not allow reassociations, then it can make the
	  implicit parents explicit in the tree structure.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-16 20:08 ` Reassociation Jeffrey A Law
@ 1998-12-17  9:24   ` Gavin Romig-Koch
  1998-12-17 10:45     ` Reassociation Stephen L Moshier
  1998-12-17 20:00     ` Reassociation Jeffrey A Law
  0 siblings, 2 replies; 13+ messages in thread
From: Gavin Romig-Koch @ 1998-12-17  9:24 UTC (permalink / raw)
  To: law; +Cc: tprince, bosch, burley, egcs, hjstein, jbuck, moshier

Jeffrey A Law writes:
 > The obvious way to do that in this case is:
 > 
 > 	* Introduce the concept of parens into the tree nodes for
 > 	  expressions.
 > 	* Honor parens and perform no associations across them.
 > 
 > 	* If a language does not allow reassociations, then it can make the
 > 	  implicit parents explicit in the tree structure.

While I agree that the front end needs a way to tell the backend
about what it can and can't optimize, I'm not sure parens in the
tree are the best way to do this.  (I've not been able to think this
completely through, and so it may be bunk, but I wanted to say something
before I forgot.)

If an arithmatic operation can't be reassociated using the mathimatical
reassociation rules, it's because of the possibilty of overflow, or
underflow, or div-by-zero (are their other?).  Collectively let me
call these side-effects (though perhaps these aren't necessarly the
same as C "side-effects").

If an operation can be reassociated, it's either because the front end
doesn't care about the effects of these side-effects, or because the
backend can determine that side-effects won't happen for a particular
operation, perhaps because of the operation and its operators, or 
perhaps because the architecture your compiling for doesn't detect
these side-effects.

So, rather than have the front end insert parens in the tree to
prevent reassociation, it would be better to have a way for the
front end to tell the back end wether or not it cares about
side-effect for a particular operation.

What's not clear to me is whether this "dont-care-about-side-effects"
flag needs to be specified per individual operation, or per function,
or per frontend, or per operation type?  Do we need separate flags
for each kind of side effect (care-about-overflow, dont-care-about-underflow)?

                                        -gavin...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-17  9:24   ` Reassociation Gavin Romig-Koch
@ 1998-12-17 10:45     ` Stephen L Moshier
  1998-12-17 19:56       ` Reassociation Jeffrey A Law
  1998-12-17 20:00     ` Reassociation Jeffrey A Law
  1 sibling, 1 reply; 13+ messages in thread
From: Stephen L Moshier @ 1998-12-17 10:45 UTC (permalink / raw)
  To: Gavin Romig-Koch; +Cc: law, tprince, bosch, burley, egcs, hjstein, jbuck

On Thu, 17 Dec 1998, Gavin Romig-Koch wrote:
> Jeffrey A Law writes:
>  > The obvious way to do that in this case is:
>  > 
>  > 	* Introduce the concept of parens into the tree nodes for
>  > 	  expressions.
>  > 	* Honor parens and perform no associations across them.
>  > 
>  > 	* If a language does not allow reassociations, then it can make the
>  > 	  implicit parents explicit in the tree structure.
> 
> While I agree that the front end needs a way to tell the backend
> about what it can and can't optimize, I'm not sure parens in the
> tree are the best way to do this.  (I've not been able to think this
> completely through, and so it may be bunk, but I wanted to say something
> before I forgot.)

Yes, currently this is analyzed at rtl level in cse.c, which already
knows all about the associative law.  If you look at the rtl dumps for
a*b*c*d versus (a*b)*(c*d) you can see they to have a distinctive
character that might be detected the same way cse.c detects other
situations.  So I think you're probably right that tree level
structures do not have to be involved.

> If an arithmatic operation can't be reassociated using the mathimatical
> reassociation rules, it's because of the possibilty of overflow, or
> underflow, or div-by-zero (are their other?).  Collectively let me
> call these side-effects (though perhaps these aren't necessarly the
> same as C "side-effects").

In general, floating point arithmetic does not obey the associative law.
Pretending that it does involves an informed decision about how to
indicate what you will or will not associate.  In the case exhibited
so far, Fortran, the programmer indicates that by leaving out some
parentheses.  The programmer could leave out whatever he wants, and
the compiler is then permitted to do something with that code.

> If an operation can be reassociated, it's either because the front end
> doesn't care about the effects of these side-effects, or because the
> backend can determine that side-effects won't happen for a particular
> operation, perhaps because of the operation and its operators, or 
> perhaps because the architecture your compiling for doesn't detect
> these side-effects.

As I hear it, it's up to the user.  The compiler doesn't decide what
may or may not be allowed.  The programmer indicates specifically what
operations can be rearranged.  I thought at first the intent
was to let the instruction scheduler decide whether a permitted
change would actually be carried out.  Maybe that is the intent but
I don't think it's been made clear yet.

> So, rather than have the front end insert parens in the tree to
> prevent reassociation, it would be better to have a way for the
> front end to tell the back end wether or not it cares about
> side-effect for a particular operation.

Currently GCC does not allow any reassociation at all for floating
point operations.  This is done by turning off potential associative
law optimizations in cse.c and combine.c.  So maybe you could approach
the new feature as a slight modification of whats's already there.
You would filter the rtl to detect whether parentheses had been
left out and if so, and if the feature flag is on, then the optimization
would proceed.  Probably some new pattern-detecting cases have to be
added also.  Maybe cse.c is not now looking for a*b*c*d, for instance.

> What's not clear to me is whether this "dont-care-about-side-effects"
> flag needs to be specified per individual operation, or per function,
> or per frontend, or per operation type?  Do we need separate flags
> for each kind of side effect (care-about-overflow, dont-care-about-underflow)

At least three people have voted for a command-line flag that turns this whole
feature on and off.  So far we seem to have only the Fortran example
of how the programmer would indicate the specific don't-care code.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-17 10:45     ` Reassociation Stephen L Moshier
@ 1998-12-17 19:56       ` Jeffrey A Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeffrey A Law @ 1998-12-17 19:56 UTC (permalink / raw)
  To: moshier; +Cc: Gavin Romig-Koch, tprince, bosch, burley, egcs, hjstein, jbuck

  In message < Pine.LNX.4.05.9812171311060.31450-100000@moshier.ne.mediaone.net >
you write:
  > > 
  > > While I agree that the front end needs a way to tell the backend
  > > about what it can and can't optimize, I'm not sure parens in the
  > > tree are the best way to do this.  (I've not been able to think this
  > > completely through, and so it may be bunk, but I wanted to say something
  > > before I forgot.)
  > 
  > Yes, currently this is analyzed at rtl level in cse.c, which already
  > knows all about the associative law.  If you look at the rtl dumps for
  > a*b*c*d versus (a*b)*(c*d) you can see they to have a distinctive
  > character that might be detected the same way cse.c detects other
  > situations.  So I think you're probably right that tree level
  > structures do not have to be involved.
The tree structures do need to get involved at some point.

Some languages (ie fortran) allow reassociation as long as we honor
paren boundaries.  We need some way to find those boundaries.  How do
you suggest we find them?

See fold-const.c  It does most of the same transformations on trees that
cse.c is doing on RTL.


jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-17  9:24   ` Reassociation Gavin Romig-Koch
  1998-12-17 10:45     ` Reassociation Stephen L Moshier
@ 1998-12-17 20:00     ` Jeffrey A Law
  1 sibling, 0 replies; 13+ messages in thread
From: Jeffrey A Law @ 1998-12-17 20:00 UTC (permalink / raw)
  To: Gavin Romig-Koch; +Cc: tprince, bosch, burley, egcs, hjstein, jbuck, moshier

  In message < 13945.15741.725261.89997@cetus.cygnus.com >you write:
  > While I agree that the front end needs a way to tell the backend
  > about what it can and can't optimize, I'm not sure parens in the
  > tree are the best way to do this.  (I've not been able to think this
  > completely through, and so it may be bunk, but I wanted to say something
  > before I forgot.)
I was just throwing out one idea.  Having the front end indicate which
expressions can be reassociated is another obvious approach.

I haven't through through either far enough to make an informed guess about
which would be better.


  > What's not clear to me is whether this "dont-care-about-side-effects"
  > flag needs to be specified per individual operation, or per function,
  > or per frontend, or per operation type?  Do we need separate flags
  > for each kind of side effect (care-about-overflow, dont-care-about-underflo
  > w)?
I think it needs to be per expression.  I don't know if we need to break it
into multiple "don't care about XYZ flags".

In any event, this isn't something we're likely to have soon.  I'd like to
suggest we slow down this discussion -- it's generating a ton of mail about
something we may not have the time to do anytime soon.

jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-20 15:36 ` Reassociation Richard Henderson
@ 1998-12-20 16:22   ` Stephen L Moshier
  0 siblings, 0 replies; 13+ messages in thread
From: Stephen L Moshier @ 1998-12-20 16:22 UTC (permalink / raw)
  To: Richard Henderson; +Cc: law, egcs

On Sun, 20 Dec 1998, Richard Henderson wrote:

> No, because that looses on ((a*b)*c)*d. 
> 
> We must add something to the tree.  There is no way to determine
> where the parentheses were.

Yes, I see you're right.  No way to tell whether the source program
had those particular parentheses or omitted them.  They generate the
same rtl as a*b*c*d.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-19  5:30 Reassociation Stephen L Moshier
@ 1998-12-20 15:36 ` Richard Henderson
  1998-12-20 16:22   ` Reassociation Stephen L Moshier
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Henderson @ 1998-12-20 15:36 UTC (permalink / raw)
  To: moshier, law, egcs

On Sat, Dec 19, 1998 at 08:30:27AM -0500, Stephen L Moshier wrote:
> By looking at the rtl.  For example, in the rtl for a*b*c*d all the
> set destinations are the same, a giveaway that the parentheses were
> omitted.

No, because that looses on ((a*b)*c)*d. 

We must add something to the tree.  There is no way to determine
where the parentheses were.


r~

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
@ 1998-12-19  5:30 Stephen L Moshier
  1998-12-20 15:36 ` Reassociation Richard Henderson
  0 siblings, 1 reply; 13+ messages in thread
From: Stephen L Moshier @ 1998-12-19  5:30 UTC (permalink / raw)
  To: law, egcs

    Some languages (ie fortran) allow reassociation as long as we honor
    paren boundaries.  We need some way to find those boundaries.  How do
    you suggest we find them?

By looking at the rtl.  For example, in the rtl for a*b*c*d all the
set destinations are the same, a giveaway that the parentheses were
omitted.  I suggest asking Wilson about this; he would probably know
whether this is a feasible plan in general.

    See fold-const.c  It does most of the same transformations on trees
    that cse.c is doing on RTL.

That's fine, but the optimizations in point are now being squelched
specifically in cse.c.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Reassociation
  1998-12-17  8:29 Reassociation tprince
@ 1998-12-17 20:03 ` Jeffrey A Law
  0 siblings, 0 replies; 13+ messages in thread
From: Jeffrey A Law @ 1998-12-17 20:03 UTC (permalink / raw)
  To: tprince; +Cc: bosch, burley, egcs, hjstein, moshier, tim

  In message < 4.19981217.11.29.23.594944@cat.e-mail.com >you write:
  > In view of the fact that left-to-right evaluation has been expected
  > from C compilers and that most current Fortran compilers follow
  > that scheme, I have tended to take that into account when setting
  > up expressions where the compiler may apply Common
  > Sub-expression Evaluation.  I don't see that as anything more
  > than a mild suggestion as to how the compiler should treat my
  > code.
Yup.  We often call this the principle of least suprise.  It may be the case
that enough folks expect C to strictly evaluate these expressions left to right
and that changing it would cause massive problems, in which case we need to
think long and hard about how wise it is to "break" such code.

jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: REASSOCIATION
  1998-12-17 14:26 REASSOCIATION tprince
@ 1998-12-17 18:26 ` Joern Rennecke
  0 siblings, 0 replies; 13+ messages in thread
From: Joern Rennecke @ 1998-12-17 18:26 UTC (permalink / raw)
  To: tprince; +Cc: bosch, burley, egcs, hjstein, jbuck, law, moshier

> Not AFAIK for C or Fortran based languages.  What we're talking
> about is the ability to let the programmer choose between
> specifying the order of evaluation, without the translator caring
> why, or letting the translator exploit the usual associativity rules
> to come up with more efficient scheduling.

Now this is one more argument for 80-bit-spills on the x86: if we do that,
and the -ffast-math flag is used, then it should be fine to reassociate
floating point multiplies.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: REASSOCIATION
@ 1998-12-17 14:26 tprince
  1998-12-17 18:26 ` REASSOCIATION Joern Rennecke
  0 siblings, 1 reply; 13+ messages in thread
From: tprince @ 1998-12-17 14:26 UTC (permalink / raw)
  To: bosch, burley, egcs, hjstein, jbuck, law, moshier

>>While I agree that the front end needs a way to tell the
backend
about what it can and can't optimize, I'm not sure parens in the
tree are the best way to do this.  (I've not been able to think this
completely through, and so it may be bunk, but I wanted to say
something
before I forgot.)<<

Not that I care about implementation details, but it probably
doesn't need a new mechanism.  The same code generation
should come about whether it's written

(a*b)*c)*d

or (too clumsily for normal source code, but the way it was done
in K&R C)

temp= a * b
temp *= c
temp * d

with the reassociation being permitted only when neither
parentheses nor assignments intervene.

>>  Do we need separate flags
for each kind of side effect (care-about-overflow,
dont-care-about-underflow)?<<

Not AFAIK for C or Fortran based languages.  What we're talking
about is the ability to let the programmer choose between
specifying the order of evaluation, without the translator caring
why, or letting the translator exploit the usual associativity rules
to come up with more efficient scheduling.

 We're not even giving the translator the option of saying we know
that we're running on a target with extra exponent range, so the
over/under-flow which the programmer must have been
guarding against can't happen here.
Dr. Timothy C. Prince
Consulting Engineer
Solar Turbines, a Caterpillar Company
alternate e-mail: tprince@computer.org

           To:                                              INTERNET - IBMMAIL
                                                            N7683011 - IBMMAIL

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: Reassociation
@ 1998-12-17  8:29 tprince
  1998-12-17 20:03 ` Reassociation Jeffrey A Law
  0 siblings, 1 reply; 13+ messages in thread
From: tprince @ 1998-12-17  8:29 UTC (permalink / raw)
  To: bosch, burley, egcs, hjstein, law, moshier, tim

         Date: 12/17/98
         From: Tim Prince                                   PRINCTC3 - SOLARDEC
      Subject: RE: Reassociation

Date:  12/17/1998  08:12 am  (Thursday)
From:  Tim Prince
To:  MRG41."RALPH:MRS:SN_L=IBMMAIL:SN_U=N4244063",
     A1.MRS_RALPH.SN_U=INTERNET|SN_L=IBMMAIL
Subject:  RE: Reassociation

While I have only the summary articles of the IEEE P754 and
P854 standards, I am convinced that these standards deal only
with single operations, not with groupings where reassociation
is possible.  Except that the reason for choosing exponents
which increase by 3 bits with each increase in format width is to
assure that the wider format will protect against intermediate
over/under-flow when 3 operands (of the next smaller format) are
multiplied together.  If the multiplication of 4 operands produces
over or under-flow in spite of the width increase, it will happen
regardless of reassociation.  Clearly, this was done to relieve the
programmer of concerns about intermediate over/under-flow.

Fortran compilers always have been expected to perform
reassociations such as Jeff mentioned, with parentheses and
assignments being treated as barriers to reassociation.  As I
pointed out yesterday, there are reputable compilers which go
beyond that, but I side with those who say that shouldn't happen
without a specific additional option selection.

I am no C expert, but my copy of draft C9X standard touches on
this:

"Except as indicated by the syntax [operator precedence,
parentheses] or [when function call, && || ?: , operators
intervene], the order of evaluation of subexpressions and the
order in which side effects take place are both unspecified".

So I understand the C9X rules on this to be consistent with what
we have been accustomed to in Fortran, and I would go so far as
to suggest that C9X be taken as a guide.  Certainly that draft
standard shows evidence of incorporation of some expert
experience.

In view of the fact that left-to-right evaluation has been expected
from C compilers and that most current Fortran compilers follow
that scheme, I have tended to take that into account when setting
up expressions where the compiler may apply Common
Sub-expression Evaluation.  I don't see that as anything more
than a mild suggestion as to how the compiler should treat my
code.

From: Joe Buck <jbuck@Synopsys.COM>
> I believe the example you were talking about yesterday violates
IEEE,
> since the reordering you suggested can change the value of
the result
> (by causing an intermediate (c*d) to overflow, when left to right
> evaluation would not cause an overflow):

Does IEEE require left-to-right evaluation of y = a * b * c * d;
when there are no parentheses present?

Dr. Timothy C. Prince
Consulting Engineer
Solar Turbines, a Caterpillar Company
alternate e-mail: tprince@computer.org

           To:                                              INTERNET - IBMMAIL
                                                            N4244063 - IBMMAIL

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~1998-12-20 16:22 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1998-12-16 19:37 Reassociation tprince
1998-12-16 20:08 ` Reassociation Jeffrey A Law
1998-12-17  9:24   ` Reassociation Gavin Romig-Koch
1998-12-17 10:45     ` Reassociation Stephen L Moshier
1998-12-17 19:56       ` Reassociation Jeffrey A Law
1998-12-17 20:00     ` Reassociation Jeffrey A Law
1998-12-17  8:29 Reassociation tprince
1998-12-17 20:03 ` Reassociation Jeffrey A Law
1998-12-17 14:26 REASSOCIATION tprince
1998-12-17 18:26 ` REASSOCIATION Joern Rennecke
1998-12-19  5:30 Reassociation Stephen L Moshier
1998-12-20 15:36 ` Reassociation Richard Henderson
1998-12-20 16:22   ` Reassociation Stephen L Moshier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).