public inbox for gcc@gcc.gnu.org
* Re: volatile and R/M/W operations
       [not found] <2007-11-30-23-19-02+trackit+sam@rfc1149.net>
@ 2007-12-01 23:32 ` Robert Dewar
  2007-12-02  8:59   ` Samuel Tardieu
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Dewar @ 2007-12-01 23:32 UTC (permalink / raw)
  To: Samuel Tardieu; +Cc: gcc

Samuel Tardieu wrote:
> When looking at an Ada PR, I stumbled upon the equivalent of the
> following C code:
> 
> unsigned char x;
> volatile unsigned char y;
> 
> void f ()
> {
>         x |= 16;
>         y |= 32;
> }
> 
> With trunk/i686, the following code is generated (-O3 -fomit-frame-pointer):
> 
> f:
>         movzbl  y, %eax
>         orb     $16, x
>         orl     $32, %eax
>         movb    %al, y
>         ret
> 
> I cannot see a reason not to use "orb $32,y" here instead of a
> three-step read/modify/write operation. Is this only a missed optimization?
> (in which case I will open a PR)

Are you sure it is an optimization? The timing on these things is
very subtle. What evidence do you have that there is a missed
optimization here?

* Re: volatile and R/M/W operations
  2007-12-01 23:32 ` volatile and R/M/W operations Robert Dewar
@ 2007-12-02  8:59   ` Samuel Tardieu
  2007-12-03  0:16     ` Robert Dewar
  0 siblings, 1 reply; 9+ messages in thread
From: Samuel Tardieu @ 2007-12-02  8:59 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc

On  1/12, Robert Dewar wrote:

>> I cannot see a reason not to use "orb $32,y" here instead of a
>> three-step read/modify/write operation. Is this only a missed optimization?
>> (in which case I will open a PR)
>
> Are you sure it is an optimization? The timing on these things is
> very subtle. What evidence do you have that there is a missed
> optimization here?

For this pattern (isolated setting of one bit in the middle of a byte at
a random memory location), a single "orb" is the best code on this
platform AFAIK.

As evidence, if you mark neither variable as volatile, GCC generates
with -O3 -fomit-frame-pointer:

f:
        orb     $16, x
        orb     $32, y
        ret

And I sure expect that GCC didn't choose to generate worse code when
I *removed* the volatile constraint :)

* Re: volatile and R/M/W operations
  2007-12-02  8:59   ` Samuel Tardieu
@ 2007-12-03  0:16     ` Robert Dewar
  2007-12-03 13:24       ` Richard Kenner
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Dewar @ 2007-12-03  0:16 UTC (permalink / raw)
  To: Samuel Tardieu; +Cc: gcc

Samuel Tardieu wrote:

> For this pattern (isolated setting of one bit in the middle of a byte at
> a random memory location), a single "orb" is the best code on this
> platform AFAIK.
> 
> As evidence, if you mark neither variable as volatile, GCC generates
> with -O3 -fomit-frame-pointer:
> 
> f:
>         orb     $16, x
>         orb     $32, y
>         ret
> 
> And I sure expect that GCC didn't choose to generate worse code when
> I *removed* the volatile constraint :)

OK, sounds reasonable, but then I don't understand the logic behind
avoiding this instruction sequence for the volatile case. This is
two accesses at the bus level, so what's the difference? I think
on earlier Pentiums these instructions were supposed to be avoided,
but of course this may have changed.

* Re: volatile and R/M/W operations
  2007-12-03  0:16     ` Robert Dewar
@ 2007-12-03 13:24       ` Richard Kenner
  2007-12-03 14:10         ` Robert Dewar
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Kenner @ 2007-12-03 13:24 UTC (permalink / raw)
  To: dewar; +Cc: gcc, sam

> OK, sounds reasonable, but then I don't understand the logic behind
> avoiding this instruction sequence for the volatile case. This is
> two accesses at the bus level, so what's the difference?

There's no difference from that perspective.  The logic behind what's
generated is that instead of trying to do a case-by-case analysis of
what instruction combinations might actually be valid for volatile
memory (which could potentially be target-specific), GCC takes the
conservative approach of simply disabling all but trivial combinations
for volatile access.

* Re: volatile and R/M/W operations
  2007-12-03 13:24       ` Richard Kenner
@ 2007-12-03 14:10         ` Robert Dewar
  2007-12-03 16:05           ` Richard Kenner
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Dewar @ 2007-12-03 14:10 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc, sam

Richard Kenner wrote:
>> OK, sounds reasonable, but then I don't understand the logic behind
>> avoiding this instruction sequence for the volatile case. This is
>> two accesses at the bus level, so what's the difference?
> 
> There's no difference from that perspective.  The logic behind what's
> generated is that instead of trying to do a case-by-case analysis of
> what instruction combinations might actually be valid for volatile
> memory (which could potentially be target-specific), GCC takes the
> conservative approach of simply disabling all but trivial combinations
> for volatile access.

Right, but it would seem this is a good candidate for combination. This
is especially true since often Volatile is used with the sense of Atomic
in Ada, and it is not a bad idea to combine these in practice, giving an
atomic update (right, nothing in the language requires it, but it is
definitely useful!)
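
Where an atomic update is actually required rather than merely useful, one
option today is to request it explicitly instead of relying on instruction
combination. A minimal sketch, assuming a C11 compiler with <stdatomic.h>
(which postdates this thread); the names `flags` and `set_flag_bits` are
illustrative only:

```c
#include <stdatomic.h>

/* Assumption: C11 atomics are available. This is not what GCC did for
   volatile at the time of the thread; it only shows an explicit request
   for an atomic read-modify-write. */
atomic_uchar flags;

/* Atomically ORs mask into flags and returns the previous value. */
unsigned char set_flag_bits(unsigned char mask)
{
    return atomic_fetch_or(&flags, mask);
}
```

The point of the sketch is that atomicity becomes part of the source-level
contract, rather than a side effect of which instructions happen to be chosen.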

* Re: volatile and R/M/W operations
  2007-12-03 14:10         ` Robert Dewar
@ 2007-12-03 16:05           ` Richard Kenner
  2007-12-03 17:28             ` Robert Dewar
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Kenner @ 2007-12-03 16:05 UTC (permalink / raw)
  To: dewar; +Cc: gcc, sam

> Right, but it would seem this is a good candidate for combination. This
> is especially true since often Volatile is used with the sense of Atomic
> in Ada, and it is not a bad idea to combine these in practice, giving an
> atomic update (right, nothing in the language requires it, but it is
> definitely useful!)

I don't disagree that "this" is a good candidate for combination, but
one problem is that by the time you're at that level, you don't easily have
the source correspondence you want.

E.g.,

	y |= 2;

and

	t1 = y | 2;
	y = t1;

are very hard to tell apart at the RTL level.  Though it's clear that
a single instruction might best match the expected semantics of the former,
it's a lot less clear that it would for the latter.
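
To make the two forms concrete, here is a minimal compilable sketch (the
function names are illustrative, not taken from GCC or the original report).
As far as I can tell, each function performs one read and one write of `y`
at the source level, which is consistent with the two being hard to
distinguish later on:

```c
volatile unsigned char y;

/* Compound assignment: y is read once, modified, and written once. */
void set_bit_compound(void)
{
    y |= 2;
}

/* Explicit temporary: a read of y followed by a separate write. */
void set_bit_split(void)
{
    unsigned char t1 = y | 2;
    y = t1;
}
```

Compiling this with -O3 -fomit-frame-pointer -S and comparing the two
function bodies is one way to check what a given compiler version does.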

From a legacy perspective, it's dangerous to muck around much in this area.

As to Ada's Atomic, it's just implementation convenience that it's mapped
to GCC's volatile.  Most things that GCC's volatile implies aren't needed
for Ada's atomic (and it may even be the case that most of them aren't even
needed for Volatile in Ada).

One approach here would be to separate the properties we now consider part
of "volatile" (e.g., "can't remove dead load", "can't change access size",
"must keep same number of loads", etc.) into separate properties and test
those in places where we now test the "volatile" attribute.  That would
be a fairly straightforward but large and pervasive change.  It's not
clear it'd be worth the effort, but I'm curious what others think.
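
The idea of separating the properties could be sketched as a set of
independent flags. This is purely illustrative: the enum, macro, and function
names below are hypothetical and do not exist in GCC; only the three
properties themselves come from the list above.

```c
#include <stdbool.h>

/* Hypothetical split of "volatile" into separate properties.
   None of these identifiers are real GCC names. */
enum access_prop {
    PROP_KEEP_DEAD_LOADS  = 1u << 0,  /* can't remove dead load */
    PROP_KEEP_ACCESS_SIZE = 1u << 1,  /* can't change access size */
    PROP_KEEP_LOAD_COUNT  = 1u << 2   /* must keep same number of loads */
};

/* What the single "volatile" test corresponds to: every property set. */
#define PROP_ALL_VOLATILE \
    (PROP_KEEP_DEAD_LOADS | PROP_KEEP_ACCESS_SIZE | PROP_KEEP_LOAD_COUNT)

/* Each pass would ask only about the property it could violate. */
bool may_remove_dead_load(unsigned props)
{
    return (props & PROP_KEEP_DEAD_LOADS) == 0;
}

bool may_change_access_size(unsigned props)
{
    return (props & PROP_KEEP_ACCESS_SIZE) == 0;
}
```

Under this scheme a front end could set only the properties a given
language construct needs, instead of everything "volatile" implies.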

* Re: volatile and R/M/W operations
  2007-12-03 16:05           ` Richard Kenner
@ 2007-12-03 17:28             ` Robert Dewar
  2007-12-03 17:35               ` Richard Kenner
  0 siblings, 1 reply; 9+ messages in thread
From: Robert Dewar @ 2007-12-03 17:28 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc, sam

Richard Kenner wrote:
>> Right, but it would seem this is a good candidate for combination. This
>> is especially true since often Volatile is used with the sense of Atomic
>> in Ada, and it is not a bad idea to combine these in practice, giving an
>> atomic update (right, nothing in the language requires it, but it is
>> definitely useful!)
> 
> I don't disagree that "this" is a good candidate for combination, but
> one problem is that by the time you're at that level, you don't easily have
> the source correspondence you want.
> 
> E.g.,
> 
> 	y |= 2;
> 
> and
> 
> 	t1 = y | 2;
> 	y = t1;
> 
> are very hard to tell apart at the RTL level.  Though it's clear that
> a single instruction might best match the expected semantics of the former,
> it's a lot less clear that it would for the latter.

I think it would still be OK for the latter, why not?

* Re: volatile and R/M/W operations
  2007-12-03 17:28             ` Robert Dewar
@ 2007-12-03 17:35               ` Richard Kenner
  2007-12-03 20:44                 ` Robert Dewar
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Kenner @ 2007-12-03 17:35 UTC (permalink / raw)
  To: dewar; +Cc: gcc, sam

> > 	t1 = y | 2;
> > 	y = t1;
> > 
> > are very hard to tell apart at the RTL level.  Though it's clear that
> > a single instruction might best match the expected semantics of the former,
> > it's a lot less clear that it would for the latter.
> 
> I think it would still be OK for the latter, why not?

There was certainly a time when it would not, because a R/M/W cycle on
a device register meant a different thing than a read followed by a write
and the latter is more clearly what the above is supposed to represent.

Whether there is still such hardware around is another question, but
the point is that whether you or I THINK it would be OK really isn't
the issue when talking about legacy code.
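
As a rough illustration of why the distinction could matter, here is a toy
software model of a register that can tell cycle types apart. Everything in
it is hypothetical (the struct, the counters, and the function names); it
does not describe any real device or bus:

```c
#include <stdint.h>

/* Toy model only: a register that records which kind of cycle it saw. */
struct toy_reg {
    uint8_t  value;
    unsigned reads;       /* plain read cycles seen */
    unsigned writes;      /* plain write cycles seen */
    unsigned rmw_cycles;  /* combined read-modify-write cycles seen */
};

uint8_t toy_read(struct toy_reg *r)
{
    r->reads++;
    return r->value;
}

void toy_write(struct toy_reg *r, uint8_t v)
{
    r->writes++;
    r->value = v;
}

void toy_rmw_or(struct toy_reg *r, uint8_t mask)
{
    r->rmw_cycles++;
    r->value |= mask;
}

/* Models "t1 = y | 2; y = t1;" as a read followed by a write. */
void toy_set_bit_split(struct toy_reg *r)
{
    uint8_t t1 = toy_read(r) | 2;
    toy_write(r, t1);
}

/* Models a single combined instruction such as "orb $2, y". */
void toy_set_bit_single(struct toy_reg *r)
{
    toy_rmw_or(r, 2);
}
```

Both paths leave the same value in the register; what differs is the
sequence of cycles the modeled device observes, which is the kind of
difference legacy code might depend on.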

* Re: volatile and R/M/W operations
  2007-12-03 17:35               ` Richard Kenner
@ 2007-12-03 20:44                 ` Robert Dewar
  0 siblings, 0 replies; 9+ messages in thread
From: Robert Dewar @ 2007-12-03 20:44 UTC (permalink / raw)
  To: Richard Kenner; +Cc: gcc, sam

Richard Kenner wrote:
>>> 	t1 = y | 2;
>>> 	y = t1;
>>>
>>> are very hard to tell apart at the RTL level.  Though it's clear that
>>> a single instruction might best match the expected semantics of the former,
>>> it's a lot less clear that it would for the latter.
>> I think it would still be OK for the latter, why not?
> 
> There was certainly a time when it would not, because a R/M/W cycle on
> a device register meant a different thing than a read followed by a write
> and the latter is more clearly what the above is supposed to represent.
> 
> Whether there is still such hardware around is another question, but
> the point is that whether you or I THINK it would be OK really isn't
> the issue when talking about legacy code.

What is interesting is whether this translation is appropriate for
modern Intel architecture chips, and as far as I know the answer
is yes.
