public inbox for crossgcc@sourceware.org
* PowerPC and Volatile
@ 2000-12-01 12:38 Roger Racine
  2000-12-01 12:55 ` Peter Barada
  2000-12-04  2:16 ` Michael Schwingen
  0 siblings, 2 replies; 9+ messages in thread
From: Roger Racine @ 2000-12-01 12:38 UTC (permalink / raw)
  To: crossgcc

We are using version 2.7.2 of GCC (Wind River supplies it as part of the 
Tornado environment.  Wind River claims it is the best version if one is 
using C).

We are seeing the following behavior on a PPC:

We have a VME board used for I/O.  When sending data, we fill a buffer on 
the board, and then write to a special location that tells the board the 
buffer is ready.  The board then sets a bit saying that it is in the 
process of sending the data.  We look for that bit to be cleared before we 
try to use the buffer again.

The problem is that we have a loop that waits for the bit to be cleared, 
and then we immediately start putting the data in the buffer, and the 
PowerPC is writing the data before the bit is cleared.  Using a VME bus 
analyzer, we see a number of reads (looking for the bit to be cleared), 
followed by "write, read, write, read" for a while, followed by the rest of 
the writes.

Naturally, the locations we are talking about are declared volatile, so the 
compiler does not optimize the code, but the PowerPC has its own 
optimization in the form of pipelining, and it seems to be causing this 
problem.

The question is, should the compiler be inserting an "eieio" instruction at 
the sequence points in the code, such as the end of the loop mentioned 
above?  This PPC instruction tells the processor to hold off its 
pipelining.  We have been inserting them in the code ourselves, but it is a 
bit of a pain to have to do it.

Another question is, does anyone know if a later version of the compiler 
has fixed this problem?

Roger Racine
Draper Laboratory

------
Want more information?  See the CrossGCC FAQ, http://www.objsw.com/CrossGCC/
Want to unsubscribe? Send a note to crossgcc-unsubscribe@sourceware.cygnus.com


* Re: PowerPC and Volatile
  2000-12-01 12:38 PowerPC and Volatile Roger Racine
@ 2000-12-01 12:55 ` Peter Barada
  2000-12-05  4:14   ` Roger Racine
  2000-12-04  2:16 ` Michael Schwingen
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Barada @ 2000-12-01 12:55 UTC (permalink / raw)
  To: rracine; +Cc: crossgcc

>Naturally, the locations we are talking about are declared volatile, so the 
>compiler does not optimize the code, but the PowerPC has its own 
>optimization in the form of pipelining, and it seems to be causing this 
>problem.
>
>The question is, should the compiler be inserting an "eieio" instruction at 
>the sequence points in the code, such as the end of the loop mentioned 
>above?  This PPC instruction tells the processor to hold off its 
>pipelining.  We have been inserting them in the code ourselves, but it is a 
>bit of a pain to have to do it.

No, the compiler can't automatically insert eieio instructions
since it doesn't know which pairs of *locations* need to have the eieio
synchronization.  That is something that only the hardware knows
about.  If it automatically put in an eieio before every volatile
reference, then it would unnecessarily slow down all volatile
accesses, even those between volatile locations that *don't* require it.

I understand that it's a pain, but when dealing with memory-mapped I/O
devices and high-performance processors, it's part of the price you pay
for the gain in speed.
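
As a rough illustration of the manual approach, the sketch below places a
barrier only at the two points where the transmit sequence described in the
original question depends on ordering. The register layout
(struct vme_board), the BUSY_BIT mask, and the function names are
hypothetical and not taken from any real board. It assumes the I/O region is
mapped caching-inhibited and guarded, and the non-PowerPC branch is only a
stand-in so the sketch builds on a host machine.

```c
#include <stdint.h>

/* Hypothetical register layout; a real board's manual defines the actual one. */
#define BUSY_BIT   0x1u
#define BUF_WORDS  16u

struct vme_board {
    volatile uint32_t status;             /* BUSY_BIT set while the board is sending */
    volatile uint32_t go;                 /* writing 1 tells the board the buffer is ready */
    volatile uint32_t buffer[BUF_WORDS];  /* transmit buffer on the board */
};

/* Order the I/O accesses issued before this point ahead of those after it. */
static inline void io_barrier(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ __volatile__("eieio" : : : "memory");
#else
    __sync_synchronize();  /* host stand-in (GCC/Clang builtin), not eieio */
#endif
}

/* Wait for the board to go idle, fill the buffer, then signal the board. */
static void send_buffer(struct vme_board *board, const uint32_t *data, unsigned n)
{
    unsigned i;

    while (board->status & BUSY_BIT)
        ;                       /* volatile forces a fresh load on each pass */

    io_barrier();               /* status loads are ordered before any buffer store */

    for (i = 0; i < n && i < BUF_WORDS; i++)
        board->buffer[i] = data[i];

    io_barrier();               /* buffer stores are ordered before the "go" store */
    board->go = 1;
}
```

Only the two ordering-sensitive points carry a barrier; the stores that fill
the buffer are left free to be combined or reordered among themselves.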

-- 
Peter Barada                                   Peter.Barada@motorola.com
Wizard                                         781-852-2768 (direct)
WaveMark Solutions(wholly owned by Motorola)   781-270-0193 (fax)

"The real art of conversation is not only to say the right thing at the
right time, but also to leave unsaid the wrong thing at the tempting
moment."  -- Unknown


* Re: PowerPC and Volatile
  2000-12-01 12:38 PowerPC and Volatile Roger Racine
  2000-12-01 12:55 ` Peter Barada
@ 2000-12-04  2:16 ` Michael Schwingen
  2000-12-05  1:13   ` Julien Ducourthial
  1 sibling, 1 reply; 9+ messages in thread
From: Michael Schwingen @ 2000-12-04  2:16 UTC (permalink / raw)
  To: Roger Racine; +Cc: crossgcc

On Fri, Dec 01, 2000 at 03:36:56PM -0500, Roger Racine wrote:
> Naturally, the locations we are talking about are declared volatile, so the 
> compiler does not optimize the code, but the PowerPC has its own 
> optimization in the form of pipelining, and it seems to be causing this 
> problem.
> 
> The question is, should the compiler be inserting an "eieio" instruction at 
> the sequence points in the code, such as the end of the loop mentioned 
> above?  This PPC instruction tells the processor to hold off its 
> pipelining.  We have been inserting them in the code ourselves, but it is a 
> bit of a pain to have to do it.

If this memory is in some I/O space, it might be easier to set up the MMU so
that this address range is set to noncached/serialized mode[1], i.e.
read/write accesses are not re-ordered by the bus logic.

cu
Michael

[1] Not sure about the exact term on the PPC, this is from the 68040 manual,
but the PPC has the same under a different name.
-- 
Michael Schwingen, Ahornstrasse 36, 52074 Aachen


* Re: PowerPC and Volatile
  2000-12-04  2:16 ` Michael Schwingen
@ 2000-12-05  1:13   ` Julien Ducourthial
  2000-12-05  6:06     ` Michael Schwingen
  0 siblings, 1 reply; 9+ messages in thread
From: Julien Ducourthial @ 2000-12-05  1:13 UTC (permalink / raw)
  To: Michael Schwingen; +Cc: Roger Racine, crossgcc

Michael Schwingen wrote:
> On Fri, Dec 01, 2000 at 03:36:56PM -0500, Roger Racine wrote:
> > Naturally, the locations we are talking about are declared volatile, so the
> > compiler does not optimize the code, but the PowerPC has its own
> > optimization in the form of pipelining, and it seems to be causing this
> > problem.
> >
> > The question is, should the compiler be inserting an "eieio" instruction at
> > the sequence points in the code, such as the end of the loop mentioned
> > above?  This PPC instruction tells the processor to hold off its
> > pipelining.  We have been inserting them in the code ourselves, but it is a
> > bit of a pain to have to do it.
>
> If this memory is in some I/O space, it might be easier to set up the MMU so
> that this address range is set to noncached/serialized mode[1], i.e.
> read/write accesses are not re-ordered by the bus logic.
>
> cu
> Michael
>
> [1] Not sure about the exact term on the PPC, this is from the 68040 manual,
> but the PPC has the same under a different name.
 
Unfortunately there is no such thing on PowerPC: even when the region is set as
non-cached and guarded (the most conservative setting), you may get out-of-order
accesses.
-- 
Julien Ducourthial       julien.ducourthial@detexis.thomson-csf.com 
LDB
Dépt SIA, SBU ISA          
THOMSON-CSF DETEXIS
 


* Re: PowerPC and Volatile
  2000-12-01 12:55 ` Peter Barada
@ 2000-12-05  4:14   ` Roger Racine
  2000-12-05  8:01     ` Peter Barada
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Racine @ 2000-12-05  4:14 UTC (permalink / raw)
  To: Peter Barada; +Cc: crossgcc

At 03:55 PM 12/1/2000 , Peter Barada wrote:

> >Naturally, the locations we are talking about are declared volatile, so the
> >compiler does not optimize the code, but the PowerPC has its own
> >optimization in the form of pipelining, and it seems to be causing this
> >problem.
> >
> >The question is, should the compiler be inserting an "eieio" instruction at
> >the sequence points in the code, such as the end of the loop mentioned
> >above?  This PPC instruction tells the processor to hold off its
> >pipelining.  We have been inserting them in the code ourselves, but it is a
> >bit of a pain to have to do it.
>
>No, the compiler can't automatically insert eieio instructions
>since it doesn't know which pairs of *locations* need to have the eieio
>synchronization.  That is something that only the hardware knows
>about.  If it automatically put in an eieio before every volatile
>reference, then it would unnecessarily slow down all volatile
>accesses, even those between volatile locations that *don't* require it.
>
>I understand that it's a pain, but when dealing with memory-mapped I/O
>devices and high-performance processors, it's part of the price you pay
>for the gain in speed.

I have a book titled "C A Reference Manual", by Samuel P. Harbison and Guy 
L. Steele Jr.  In section 4.4.5, they say the following:

"   To be more precise, ISO C introduces the notion of sequence points in C 
programs.  A sequence point exists at the completion of all expressions 
that are not part of a larger expression--that is, at the end of expression 
statements, control expressions if, switch, while, and do statements, each 
of the three control expressions in the for statement, return statement 
expressions, and initializers.  Additional sequence points are present in 
function calls immediately after all the arguments are evaluated, in the 
logical AND (&&) and OR (||) expressions, and before the conditional 
operator (?:) and the comma operator (,).
    References to and modifications of volatile objects must not be 
optimized across sequence points, although optimizations between sequence 
points are permitted."

Based on this (I do not have the ISO C Standard to check), there -are- 
known locations at which the eieio instructions could be (and should be, to 
my reading) inserted.

Based on the answers I have received, it appears that this is not a case of 
using an old version of the compiler, which has been corrected in a later 
version; I will have to do it manually.
Roger Racine
Draper Laboratory, MS 31
555 Technology Sq.
Cambridge, MA 02139, USA
617-258-2489
617-258-3939 Fax


* Re: PowerPC and Volatile
  2000-12-05  1:13   ` Julien Ducourthial
@ 2000-12-05  6:06     ` Michael Schwingen
  2000-12-05  6:56       ` Julien Ducourthial
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Schwingen @ 2000-12-05  6:06 UTC (permalink / raw)
  To: Julien Ducourthial; +Cc: Roger Racine, crossgcc

On Tue, Dec 05, 2000 at 10:10:02AM +0100, Julien Ducourthial wrote:
> Unfortunately there is no such thing on PowerPC, even when set as non-cached and
> guarded (the most conservative setting) you may get out of order accesses.

Does this behaviour depend on the specific PPC CPU/MMU used? When working on
the MPC860, setting I/O spaces to non-cached/guarded (that was the term I was
looking for) worked fine without requiring eieio instructions throughout the
code.

I guess I should re-read that chapter in the manual next week ...

cu
Michael
-- 
Michael Schwingen, Ahornstrasse 36, 52074 Aachen


* Re: PowerPC and Volatile
  2000-12-05  6:06     ` Michael Schwingen
@ 2000-12-05  6:56       ` Julien Ducourthial
  2000-12-05 10:22         ` Michael Schwingen
  0 siblings, 1 reply; 9+ messages in thread
From: Julien Ducourthial @ 2000-12-05  6:56 UTC (permalink / raw)
  To: Michael Schwingen; +Cc: Roger Racine, crossgcc

Michael Schwingen wrote:
> On Tue, Dec 05, 2000 at 10:10:02AM +0100, Julien Ducourthial wrote:
> > Unfortunately there is no such thing on PowerPC, even when set as non-cached and
> > guarded (the most conservative setting) you may get out of order accesses.
>
> Does this behaviour depend on the specific PPC CPU/MMU used? When working on
> MPC860, setting IO spaces to non-cache/guarded (that was the term I was
> looking for) worked fine without requiring eieio instructions throughout the
> code.
>
> I guess I should re-read that chapter in the manual next week ...
>
> cu
> Michael

The exact behaviour seems to depend on the PPC model. I worked on
a driver for an AMD Ethernet chip, for an in-house OS. There was no eieio
in the code, but the registers were mapped with guarded and cache-inhibited
MMU attributes. It ran flawlessly on a 603e board. But when we received
newer boards with a 604e CPU, the driver no longer worked.
The problem was that the chip's internal registers are accessed
through two registers (one for address, one for data), so when reading an
internal register you have to first write its address and then read the data.
Without an eieio in between, the read is made ahead of the write (at least
on the 604e) and ... you do not really get the data expected.
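
The address/data access pattern described above might be sketched as
follows. This is only an illustration: the struct indexed_chip layout, the
16-bit register width, and the function names are assumptions, not the real
AMD chip's interface, and the non-PowerPC branch is a stand-in so the sketch
builds on a host machine.

```c
#include <stdint.h>

/* Hypothetical two-port interface: an address (index) port and a data port. */
struct indexed_chip {
    volatile uint16_t addr;   /* selects which internal register is visible */
    volatile uint16_t data;   /* reads/writes the selected internal register */
};

/* Keep the I/O accesses before this point ordered ahead of those after it. */
static inline void eieio_barrier(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ __volatile__("eieio" : : : "memory");
#else
    __sync_synchronize();  /* host stand-in (GCC/Clang builtin), not eieio */
#endif
}

/* Read an internal register: select it, then read the data port. */
static uint16_t chip_read(struct indexed_chip *chip, uint16_t reg)
{
    chip->addr = reg;
    eieio_barrier();          /* the address store must reach the chip first */
    return chip->data;
}

/* Write an internal register: select it, then write the data port. */
static void chip_write(struct indexed_chip *chip, uint16_t reg, uint16_t value)
{
    chip->addr = reg;
    eieio_barrier();          /* select before the data store */
    chip->data = value;
    eieio_barrier();          /* finish this access before any later one */
}
```

Without the barrier in chip_read(), the load from the data port may be
performed before the store to the address port, which matches the failure
seen on the 604e.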
 
-- 
Julien Ducourthial       julien.ducourthial@detexis.thomson-csf.com 
LDB
Dépt SIA, SBU ISA          
THOMSON-CSF DETEXIS
 


* Re: PowerPC and Volatile
  2000-12-05  4:14   ` Roger Racine
@ 2000-12-05  8:01     ` Peter Barada
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Barada @ 2000-12-05  8:01 UTC (permalink / raw)
  To: rracine; +Cc: crossgcc

>> >Naturally, the locations we are talking about are declared volatile, so the
>> >compiler does not optimize the code, but the PowerPC has its own
>> >optimization in the form of pipelining, and it seems to be causing this
>> >problem.
>> >
>> >The question is, should the compiler be inserting an "eieio" instruction at
>> >the sequence points in the code, such as the end of the loop mentioned
>> >above?  This PPC instruction tells the processor to hold off its
>> >pipelining.  We have been inserting them in the code ourselves, but it is a
>> >bit of a pain to have to do it.
>>
>>No, the compiler can't automatically insert eieio instructions
>>since it doesn't know which pairs of *locations* need to have the eieio
>>synchronization.  That is something that only the hardware knows
>>about.  If it automatically put in an eieio before every volatile
>>reference, then it would unnecessarily slow down all volatile
>>accesses, even those between volatile locations that *don't* require it.
>>
>>I understand that it's a pain, but when dealing with memory-mapped I/O
>>devices and high-performance processors, it's part of the price you pay
>>for the gain in speed.
>
>I have a book titled "C A Reference Manual", by Samuel P. Harbison and Guy 
>L. Steele Jr.  In section 4.4.5, they say the following:
>
>"   To be more precise, ISO C introduces the notion of sequence points in C 
>programs.  A sequence point exists at the completion of all expressions 
>that are not part of a larger expression--that is, at the end of expression 
>statements, control expressions if, switch, while, and do statements, each 
>of the three control expressions in the for statement, return statement 
>expressions, and initializers.  Additional sequence points are present in 
>function calls immediately after all the arguments are evaluated, in the 
>logical AND (&&) and OR (||) expressions, and before the conditional 
>operator (?:) and the comma operator (,).
>    References to and modifications of volatile objects must not be 
>optimized across sequence points, although optimizations between sequence 
>points are permitted."
>
>Based on this (I do not have the ISO C Standard to check), there -are- 
>known locations at which the eieio instructions could be (and should be, to 
>my reading) inserted.

There's no doubt that you can modify the compiler to emit eieio
instructions at sequence points, but to what end? It would slow
down the code everywhere just so that the tiny percentage of code
that needs the eieio instructions works correctly.  That is like
using a bazooka to hunt mosquitoes; it works, but it is really inefficient.

It's better to fix the code, not the compiler.
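
One possible way to fix the code without scattering barriers by hand is to
route device accesses through small accessor functions that issue the eieio
themselves, so that only I/O code pays for it. The sketch below is an
assumption about how that could look, not an existing API: io_sync(),
io_read32(), and io_write32() are hypothetical names, and the non-PowerPC
branch is a stand-in so the sketch builds on a host machine.

```c
#include <stdint.h>

/* Keep the I/O accesses before this point ordered ahead of those after it. */
static inline void io_sync(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ __volatile__("eieio" : : : "memory");
#else
    __sync_synchronize();  /* host stand-in (GCC/Clang builtin), not eieio */
#endif
}

/* Load from a device register, then fence so later I/O cannot pass it. */
static inline uint32_t io_read32(const volatile uint32_t *reg)
{
    uint32_t value = *reg;
    io_sync();
    return value;
}

/* Store to a device register, then fence so later I/O cannot pass it. */
static inline void io_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
    io_sync();
}
```

This trades a barrier on every device access for simpler driver code;
ordinary volatile variables that never touch the bus keep using plain loads
and stores.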

-- 
Peter Barada                                   Peter.Barada@motorola.com
Wizard                                         781-852-2768 (direct)
WaveMark Solutions(wholly owned by Motorola)   781-270-0193 (fax)

"The real art of conversation is not only to say the right thing at the
right time, but also to leave unsaid the wrong thing at the tempting
moment."  -- Unknown


* Re: PowerPC and Volatile
  2000-12-05  6:56       ` Julien Ducourthial
@ 2000-12-05 10:22         ` Michael Schwingen
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Schwingen @ 2000-12-05 10:22 UTC (permalink / raw)
  To: Julien Ducourthial; +Cc: Roger Racine, crossgcc

On Tue, Dec 05, 2000 at 03:52:33PM +0100, Julien Ducourthial wrote:
> The exact behaviour seems to depend on the PPC model. I worked on a driver for an AMD
> Ethernet chip, for an in-house OS. There was no eieio in the code, but the registers
> were mapped with guarded and cache-inhibited MMU attributes. It ran flawlessly on a
> 603e board. But when we received newer boards with a 604e CPU, the driver no longer
> worked.

Ah - IIRC, the MPC860 has a 603 or 603e core (I don't have access to the
manuals until I am back in office next week), so we should be safe *for
now*.

> first write its address then read the data. Without eieio in between, the read is made
> ahead of the write (at least on the 604e) and ... you do not really get the data
> expected.

Yup - there are enough other scenarios where re-ordered accesses can cause
problems even when all registers are directly memory-mapped, and on the
MPC860, you also have to take care of the buffer descriptors, which may be in
main memory for some peripherals.

cu
Michael
