public inbox for gcc-patches@gcc.gnu.org
* [cxx-mem-model] bitfield tests
@ 2011-03-29 17:28 Aldy Hernandez
  2011-03-30 10:50 ` Richard Guenther
  0 siblings, 1 reply; 18+ messages in thread
From: Aldy Hernandez @ 2011-03-29 17:28 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1743 bytes --]

[Language lawyers, please correct me if I have mis-interpreted the 
upcoming standard in any way.]

In the C++ memory model, contiguous bitfields comprise a single memory 
location, so it's fair game to bit twiddle them when setting them.  For 
example:

	struct {
		unsigned int a : 4;
		unsigned int b : 4;
		unsigned int c : 4;
	};

In the above example, you can touch <b> and <c> while setting <a>.  No 
race there.
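
As a rough sketch of what "bit twiddling" means here, setting <a> can be
done as a read-modify-write of the whole storage unit.  The bit layout
and the names set_a and kMaskA are assumptions made up for illustration;
real layouts are ABI-specific and this is not GCC's generated code.

```cpp
#include <cstdint>

// Assumed layout, for illustration only: <a> in bits 0-3, <b> in bits 4-7,
// <c> in bits 8-11 of one 32-bit storage unit.
constexpr std::uint32_t kMaskA = 0xFu;

// Set <a> by loading the whole unit, splicing in the new 4 bits, and
// returning the value that would be stored back.  <b> and <c> are
// rewritten with the values that were just loaded.
constexpr std::uint32_t set_a(std::uint32_t unit, std::uint32_t x)
{
    return (unit & ~kMaskA) | (x & kMaskA);
}

static_assert(set_a(0xABCu, 0x5u) == 0xAB5u, "only the low nibble changes");
```

Since <a>, <b> and <c> form one memory location, a race-free program
cannot have another thread writing <b> or <c> at the same time without
synchronization, so writing back the bits just loaded is not observable.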

However, non-contiguous bitfields are a different story:

	struct {
		unsigned int a : 4;
		char b;
		unsigned int c : 6;
	};

Here we have 3 distinct memory locations, so you can't touch <b> or <c> 
while setting <a>.  No bit twiddling allowed.
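
To make the problem concrete, the sketch below replays one possible
interleaving sequentially on a toy 4-byte word: thread 1 sets <a> with a
word-sized read-modify-write while thread 2 legitimately stores to <b>.
The byte positions and function names are assumptions for illustration
only, not a statement about any particular ABI or code generator.

```cpp
// Toy word: <a> lives in the low 4 bits of byte 0, <b> is byte 1.

// Thread 1 uses a word-sized read-modify-write to set <a>; thread 2's
// store to <b> lands between the load and the store.
constexpr unsigned char b_after_wide_store()
{
    unsigned char mem[4] = {};   // shared memory, b == 0
    unsigned char tmp[4] = {};   // thread 1's private copy
    for (int i = 0; i < 4; ++i) tmp[i] = mem[i];               // T1: load word
    mem[1] = 77;                                               // T2: b = 77
    tmp[0] = static_cast<unsigned char>((tmp[0] & 0xF0) | 12); // T1: a = 12
    for (int i = 0; i < 4; ++i) mem[i] = tmp[i];               // T1: store word
    return mem[1];
}

// Same interleaving, but thread 1 only touches the byte holding <a>.
constexpr unsigned char b_after_narrow_store()
{
    unsigned char mem[4] = {};
    mem[1] = 77;                                               // T2: b = 77
    mem[0] = static_cast<unsigned char>((mem[0] & 0xF0) | 12); // T1: a = 12
    return mem[1];
}
```

With the wide store, <b> ends up as 0: the store of 77 is lost even
though the source program has no race.  With the narrow store it stays 77.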

Similarly for bitfields separated by a zero-length bitfield:

	struct {
		unsigned int a : 4;
		int : 0;
		unsigned int c : 6;
	};

In the above example, <a> and <c> are distinct memory locations.
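
On common ABIs (GCC on x86-64 Linux, for instance) the zero-length
bitfield also has a visible layout effect: it closes the current
allocation unit, so <c> starts in a fresh one.  The sketch below checks
this through sizeof.  The struct names are made up, and bitfield layout
is implementation-defined, so this is an observation about typical
targets rather than a guarantee.

```cpp
// Both fields can share one allocation unit.
struct Packed {
    unsigned int a : 4;
    unsigned int c : 6;
};

// The zero-length bitfield forces <c> to the next allocation unit.
struct Separated {
    unsigned int a : 4;
    int : 0;
    unsigned int c : 6;
};

// True when the separator pushed <c> into additional storage.
constexpr bool separator_starts_new_unit()
{
    return sizeof(Separated) > sizeof(Packed);
}
```

Where this holds, <a> and <c> can be updated with accesses that never
overlap, which is what treating them as distinct memory locations needs.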

A structure/union boundary will also separate previously 
contiguous bit sequences:

	struct {
		unsigned int a : 4;
		struct { unsigned int b : 4; } BBB;
		unsigned int c : 4;
	};

Here we have 3 distinct memory locations, so again, we can't clobber <b> 
or <c> while setting <a>.

The patch below adds a non-contiguous bit test (bitfields-2.C) which 
passes on x86, but upon assembly inspection, fails on PPC64, s390, and 
Alpha.  These 3 architectures bit-twiddle their way out of the problem.

There is also a similar test already in the testsuite (bitfields.C); 
it differs only in that one field is volatile.  That test fails on 
x86-64 as well and is the subject of PR48128, which Jakub is currently 
tackling.

As soon as Jakub finishes with PR48128, I will be working on getting 
these bitfield tests working in a C++ memory model fashion.

Committing to branch.

[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 2595 bytes --]

	* gcc.dg/memmodel/subfields.c (set_a): Set noinline attribute.
	* g++.dg/memmodel/bitfields.C (set_a): Same.
	* g++.dg/memmodel/bitfields-2.C: New.

Index: gcc.dg/memmodel/subfields.c
===================================================================
--- gcc.dg/memmodel/subfields.c	(revision 170937)
+++ gcc.dg/memmodel/subfields.c	(working copy)
@@ -20,6 +20,7 @@ struct test_struct {
    not affect any of the other fields in the structure.  An improper
    implementation may load an entire word, change the 8 bits for field
    'a' and write the entire word back out. */
+__attribute__((noinline))
 void set_a(char x)
 {
   var.a = x;
Index: g++.dg/memmodel/bitfields-2.C
===================================================================
--- g++.dg/memmodel/bitfields-2.C	(revision 0)
+++ g++.dg/memmodel/bitfields-2.C	(revision 0)
@@ -0,0 +1,71 @@
+/* { dg-do link } */
+/* { dg-options "-O2 --param allow-load-data-races=0 --param allow-store-data-races=0" } */
+/* { dg-final { memmodel-gdb-test } } */
+
+/* Test that setting <var.a> does not touch either <var.b> or <var.c>.
+   In the C++ memory model, non contiguous bitfields ("a" and "c"
+   here) should be considered as distinct memory locations, so we
+   can't use bit twiddling to set either one.  */
+
+#include <stdio.h>
+#include "memmodel.h"
+
+#define CONSTA 12
+
+static int global;
+struct S
+{
+  unsigned int a : 4;
+  unsigned char b;
+  unsigned int c : 6;
+} var;
+
+__attribute__((noinline))
+void set_a()
+{
+  var.a = CONSTA;
+}
+
+void memmodel_other_threads()
+{
+  ++global;
+  var.b = global;
+  var.c = global;
+}
+
+int memmodel_step_verify()
+{
+  int ret = 0;
+  if (var.b != global)
+    {
+      printf ("FAIL: Unexpected value: var.b is %d, should be %d\n",
+	      var.b, global);
+      ret = 1;
+    }
+  if (var.c != global)
+    {
+      printf ("FAIL: Unexpected value: var.c is %d, should be %d\n",
+	      var.c, global);
+      ret = 1;
+    }
+  return ret;
+}
+
+int memmodel_final_verify()
+{
+  int ret = memmodel_step_verify();
+  if (var.a != CONSTA)
+    {
+      printf ("FAIL: Unexpected value: var.a is %d, should be %d\n",
+	      var.a, CONSTA);
+      ret = 1;
+    }
+  return ret;
+}
+
+int main()
+{
+  set_a();
+  memmodel_done();
+  return 0;
+}
Index: g++.dg/memmodel/bitfields.C
===================================================================
--- g++.dg/memmodel/bitfields.C	(revision 171248)
+++ g++.dg/memmodel/bitfields.C	(working copy)
@@ -23,6 +23,7 @@ struct S
   unsigned int c : 6;
 } var;
 
+__attribute__((noinline))
 void set_a()
 {
   var.a = CONSTA;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [cxx-mem-model] bitfield tests
  2011-03-29 17:28 [cxx-mem-model] bitfield tests Aldy Hernandez
@ 2011-03-30 10:50 ` Richard Guenther
  2011-03-30 14:24   ` Aldy Hernandez
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2011-03-30 10:50 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches, Jakub Jelinek

On Tue, Mar 29, 2011 at 7:00 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> [Language lawyers, please correct me if I have mis-interpreted the upcoming
> standard in any way.]
>
> In the C++ memory model, contiguous bitfields comprise a single memory
> location, so it's fair game to bit twiddle them when setting them.  For
> example:
>
>        struct {
>                unsigned int a : 4;
>                unsigned int b : 4;
>                unsigned int c : 4;
>        };
>
> In the above example, you can touch <b> and <c> while setting <a>.  No race
> there.
>
> However, non-contiguous bitfields are a different story:
>
>        struct {
>                unsigned int a : 4;
>                char b;
>                unsigned int c : 6;
>        };
>
> Here we have 3 distinct memory locations, so you can't touch <b> or <c>
> while setting <a>.  No bit twiddling allowed.
>
> Similarly for bitfields separated by a zero-length bitfield:
>
>        struct {
>                unsigned int a : 4;
>                int : 0;
>                unsigned int c : 6;
>        };
>
> In the above example, <a> and <c> are distinct memory locations.
>
> A structure/union boundary will also separate previously contiguous
> bit sequences:
>
>        struct {
>                unsigned int a : 4;
>                struct { unsigned int b : 4; } BBB;
>                unsigned int c : 4;
>        };
>
> Here we have 3 distinct memory locations, so again, we can't clobber <b> or
> <c> while setting <a>.
>
> The patch below adds a non-contiguous bit test (bitfields-2.C) which passes
> on x86, but upon assembly inspection, fails on PPC64, s390, and Alpha.
>  These 3 architectures bit-twiddle their way out of the problem.

The memory model is not implementable on strict-alignment targets
that do not have a byte store operation.  But we previously said that ;)

Also consider global vars

char a;
char b;

accessing them on strict-align targets may access adjacent globals
(that's a problem anyway, also with alias analysis).
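
For illustration, a byte store on a target with only word-sized stores
has to be emulated roughly as follows.  This is a hypothetical sketch
(little-endian byte numbering and the function name are assumptions),
not any target's actual expansion:

```cpp
#include <cstdint>

// Model of "store one byte" on a machine that can only load and store
// aligned 32-bit words.  Takes the word as loaded and returns the word
// that would be stored back.
constexpr std::uint32_t store_byte_via_word(std::uint32_t word,
                                            unsigned byte_index,
                                            std::uint8_t value)
{
    const unsigned shift = (byte_index & 3u) * 8u;      // position of the byte
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;
    return (word & ~mask) | (std::uint32_t{value} << shift);
}

static_assert(store_byte_via_word(0x11223344u, 1, 0xAA) == 0x1122AA44u,
              "only byte 1 changes in the result");
```

The value stored back covers all four bytes, so if a and b share a word,
a store to a also rewrites b with whatever was loaded earlier.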

Richard.

> There is also a similar test already in the testsuite (bitfields.C); it
> differs only in that one field is volatile.  That test fails on x86-64 as
> well and is the subject of PR48128, which Jakub is currently tackling.
>
> As soon as Jakub finishes with PR48128, I will be working on getting these
> bitfield tests working in a C++ memory model fashion.
>
> Committing to branch.
>

* Re: [cxx-mem-model] bitfield tests
  2011-03-30 10:50 ` Richard Guenther
@ 2011-03-30 14:24   ` Aldy Hernandez
  2011-03-30 14:25     ` Richard Guenther
  2011-03-30 14:39     ` Michael Matz
  0 siblings, 2 replies; 18+ messages in thread
From: Aldy Hernandez @ 2011-03-30 14:24 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1593 bytes --]


> The memory model is not implementable on strict-alignment targets
> that do not have a byte store operation.  But we previously said that ;)

Yes.  I think we should issue an error when we have such a target and 
the user tries -fmemory-model=c++0x.  However, how many strict-alignment 
targets are not byte addressable nowadays?

> Also consider global vars
>
> char a;
> char b;
>
> accessing them on strict-align targets may access adjacent globals
> (that's a problem anyway, also with alias analysis).

Good point.  I am adding a test to that effect (see attached patch).

BTW, I assume you mean strict-align targets WITHOUT byte-addressability 
as above.  I have spot-checked your scenario on a handful of important 
targets that have strict alignment, and all of them work without 
touching adjacent global vars:

	arm-elf		OK
	sparc-linux	OK
	ia64-linux	OK
	alpha-linux	OK, but only with -mbwx (byte addressability)

rth tells me that we shouldn't worry about ancient non-byte addressable 
Alphas, so the last isn't an issue.

So... do you have any important targets in mind, because I don't see 
this being a problem for most targets?  As can be expected, I am only 
interested in x86*, powerpc*, and s390, especially since a cursory 
glance on other important targets didn't exhibit any problems.  However, 
given my target bias, I am willing to look into any important targets 
that are problematic (I'm hoping none :)).

Let me know if you see anything else, and please take a quick peek at 
the attached patch below, which I will be committing shortly.

As usual, thanks.
Aldy

[-- Attachment #2: curr --]
[-- Type: text/plain, Size: 1199 bytes --]

Index: testsuite/gcc.dg/memmodel/strict-align-global.c
===================================================================
--- testsuite/gcc.dg/memmodel/strict-align-global.c	(revision 0)
+++ testsuite/gcc.dg/memmodel/strict-align-global.c	(revision 0)
@@ -0,0 +1,46 @@
+/* { dg-do link } */
+/* { dg-options "-O2 --param allow-packed-store-data-races=0" } */
+/* { dg-final { memmodel-gdb-test } } */
+
+#include <stdio.h>
+#include "memmodel.h"
+
+/* This test verifies writes to globals do not write to adjacent
+   globals.  This mostly happens on strict-align targets that are not
+   byte addressable (old Alphas, etc).  */
+
+char a = 0;
+char b = 77;
+
+void memmodel_other_threads() 
+{
+}
+
+int memmodel_step_verify()
+{
+  if (b != 77)
+    {
+      printf("FAIL: Unexpected value.  <b> is %d, should be 77\n", b);
+      return 1;
+    }
+  return 0;
+}
+
+/* Verify that every variable has the correct value.  */
+int memmodel_final_verify()
+{
+  int ret = memmodel_step_verify ();
+  if (a != 66)
+    {
+      printf("FAIL: Unexpected value.  <a> is %d, should be 66\n", a);
+      return 1;
+    }
+  return ret;
+}
+
+int main ()
+{
+  a = 66;
+  memmodel_done();
+  return 0;
+}

* Re: [cxx-mem-model] bitfield tests
  2011-03-30 14:24   ` Aldy Hernandez
@ 2011-03-30 14:25     ` Richard Guenther
  2011-03-30 14:41       ` Aldy Hernandez
  2011-03-31 14:58       ` Jeff Law
  2011-03-30 14:39     ` Michael Matz
  1 sibling, 2 replies; 18+ messages in thread
From: Richard Guenther @ 2011-03-30 14:25 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches, Jakub Jelinek

On Wed, Mar 30, 2011 at 4:11 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> The memory model is not implementable on strict-alignment targets
>> that do not have a byte store operation.  But we previously said that ;)
>
> Yes.  I think we should issue an error when we have such a target and the
> user tries -fmemory-model=c++0x.  However, how many strict-alignment targets
> are not byte addressable nowadays?
>
>> Also consider global vars
>>
>> char a;
>> char b;
>>
>> accessing them on strict-align targets may access adjacent globals
>> (that's a problem anyway, also with alias analysis).
>
> Good point.  I am adding a test to that effect (see attached patch).
>
> BTW, I assume you mean strict-align targets WITHOUT byte-addressability as
> above.  I have spot-checked your scenario on a handful of important targets
> that have strict alignment, and all of them work without touching adjacent
> global vars:
>
>        arm-elf         OK
>        sparc-linux     OK
>        ia64-linux      OK
>        alpha-linux     OK, but only with -mbwx (byte addressability)
>
> rth tells me that we shouldn't worry about ancient non-byte addressable
> Alphas, so the last isn't an issue.
>
> So... do you have any important targets in mind, because I don't see this
> being a problem for most targets?  As can be expected, I am only interested
> in x86*, powerpc*, and s390, especially since a cursory glance on other
> important targets didn't exhibit any problems.  However, given my target
> bias, I am willing to look into any important targets that are problematic
> (I'm hoping none :)).

Well, I'm not sure that strict-align targets that provide byte access do
not simply hide the issue inside the CPU (thus, perform the read-modify-write
there and do not guarantee any atomicity unless you ask for it).  It might
be even worse - targets might not even guarantee this for shared cache-lines
(for non-ccNUMA architectures).  But I'm no expert here, but certainly
every possible weird CPU architecture has been implemented.

Richard.

* Re: [cxx-mem-model] bitfield tests
  2011-03-30 14:24   ` Aldy Hernandez
  2011-03-30 14:25     ` Richard Guenther
@ 2011-03-30 14:39     ` Michael Matz
  1 sibling, 0 replies; 18+ messages in thread
From: Michael Matz @ 2011-03-30 14:39 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: Richard Guenther, gcc-patches, Jakub Jelinek

Hi,

On Wed, 30 Mar 2011, Aldy Hernandez wrote:

> 
> > The memory model is not implementable on strict-alignment targets
> > that do not have a byte store operation.  But we previously said that ;)
> 
> Yes.  I think we should issue an error when we have such a target and the user
> tries -fmemory-model=c++0x.  However, how many strict-alignment targets are
> not byte addressable nowadays?

Consider cache aliasing, where the unit of coherence (absent using atomic 
instructions) is for instance 64 bytes.  I'm not sure how the mem-model 
could be implemented without generally falling back to atomics.

Or CPU internal write buffers that could (again if there are just normal 
writes, not atomics) reorder or merge write requests.  I think that would 
also destroy guarantees that the cxx-mem-model tries to provide.


Ciao,
Michael.

* Re: [cxx-mem-model] bitfield tests
  2011-03-30 14:25     ` Richard Guenther
@ 2011-03-30 14:41       ` Aldy Hernandez
  2011-03-30 14:43         ` Richard Guenther
  2011-03-31 14:58       ` Jeff Law
  1 sibling, 1 reply; 18+ messages in thread
From: Aldy Hernandez @ 2011-03-30 14:41 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jakub Jelinek


>> So... do you have any important targets in mind, because I don't see this
>> being a problem for most targets?  As can be expected, I am only interested
>> in x86*, powerpc*, and s390, especially since a cursory glance on other
>> important targets didn't exhibit any problems.  However, given my target
>> bias, I am willing to look into any important targets that are problematic
>> (I'm hoping none :)).
>
> Well, I'm not sure that strict-align targets that provide byte access do
> not simply hide the issue inside the CPU (thus, perform the read-modify-write
> there and do not guarantee any atomicity unless you ask for it).  It might
> be even worse - targets might not even guarantee this for shared cache-lines
> (for non-ccNUMA architectures).  But I'm no expert here, but certainly
> every possible weird CPU architecture has been implemented.

Whoops, sorry I missed your off-list followup from yesterday (I'm 
reading mail sequentially :)):

 > Richard Guenther said:
 > strict-align targets will end up doing read-modify-write operations on
 > word-size even when accessing single bytes.  Note that some CPUs
 > have byte store operations but they usually are not guaranteed to
 > be "atomic" (thus, they simply do the read-modify-write in the CPU).
 > I am not aware of any strict-align CPU that can do atomic byte stores.
 >
 > Obvious problem when for example having multiple non-word-size
 > global vars (unless you force them to word-alignment).

I was not aware of how this played out internally.  This is certainly a 
problem.  I will hunt down hardware for at least arm, sparc, and ia64, 
and investigate.  But it may be that the only option will be to disallow 
the C++ memory model on strictly aligned hardware, or perhaps force 
word-alignment.

Is forcing word-alignment too big of a hammer, or will the users for 
these architectures be content with having no support for the C++0x 
memory model?
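
A minimal sketch of what forcing word-alignment could look like at the
source level, assuming a 4-byte word.  The type name WordAlignedChar and
the constant kWordBytes are hypothetical and exist only for this
illustration:

```cpp
#include <cstddef>

constexpr std::size_t kWordBytes = 4;   // assumed word size of the target

// A char that always owns a whole aligned word, so a word-sized
// read-modify-write of one object cannot overlap another object.
struct alignas(kWordBytes) WordAlignedChar {
    char value;
};

static_assert(alignof(WordAlignedChar) == kWordBytes, "starts on a word boundary");
static_assert(sizeof(WordAlignedChar) == kWordBytes, "padded out to a full word");

WordAlignedChar a{0};
WordAlignedChar b{77};
```

The cost is visible in the asserts: each one-byte object now occupies a
full word.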

Aldy

* Re: [cxx-mem-model] bitfield tests
  2011-03-30 14:41       ` Aldy Hernandez
@ 2011-03-30 14:43         ` Richard Guenther
  2011-03-30 15:13           ` Mike Stump
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2011-03-30 14:43 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc-patches, Jakub Jelinek

On Wed, Mar 30, 2011 at 4:26 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>>> So... do you have any important targets in mind, because I don't see this
>>> being a problem for most targets?  As can be expected, I am only
>>> interested
>>> in x86*, powerpc*, and s390, especially since a cursory glance on other
>>> important targets didn't exhibit any problems.  However, given my target
>>> bias, I am willing to look into any important targets that are
>>> problematic
>>> (I'm hoping none :)).
>>
>> Well, I'm not sure that strict-align targets that provide byte access do
>> not simply hide the issue inside the CPU (thus, perform the
>> read-modify-write
>> there and do not guarantee any atomicity unless you ask for it).  It might
>> be even worse - targets might not even guarantee this for shared
>> cache-lines
>> (for non-ccNUMA architectures).  But I'm no expert here, but certainly
>> every possible weird CPU architecture has been implemented.
>
> Whoops, sorry I missed your off-list followup from yesterday (I'm reading
> mail sequentially :)):
>
>> Richard Guenther said:
>> strict-align targets will end up doing read-modify-write operations on
>> word-size even when accessing single bytes.  Note that some CPUs
>> have byte store operations but they usually are not guaranteed to
>> be "atomic" (thus, they simply do the read-modify-write in the CPU).
>> I am not aware of any strict-align CPU that can do atomic byte stores.
>>
>> Obvious problem when for example having multiple non-word-size
>> global vars (unless you force them to word-alignment).
>
> I was not aware of how this played out internally.  This is certainly a
> problem.  I will hunt down hardware for at least arm, sparc, and ia64, and
> investigate.  But it may be that the only option will be to disallow the C++
> memory model on strictly aligned hardware, or perhaps force word-alignment.
>
> Is forcing word-alignment too big of a hammer, or will the users for these
> architectures be content with having no support for the C++0x memory model?

I think a memory model that cannot be reasonably (read: also fast) implemented
on all HW is screwed from the start and we should simply ditch it.  Nobody
will use it, because either you cannot rely on it when writing portable
programs or it will be hell slow.

Richard.

* Re: [cxx-mem-model] bitfield tests
  2011-03-30 14:43         ` Richard Guenther
@ 2011-03-30 15:13           ` Mike Stump
  0 siblings, 0 replies; 18+ messages in thread
From: Mike Stump @ 2011-03-30 15:13 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Aldy Hernandez, gcc-patches, Jakub Jelinek

On Mar 30, 2011, at 7:40 AM, Richard Guenther wrote:
>> Is forcing word-alignment too big of a hammer, or will the users for these
>> architectures be content with having no support for the C++0x memory model?
> 
> I think a memory model that cannot be reasonably (read: also fast) implemented
> on all HW is screwed from the start and we should simply ditch it.  Which
> is because nobody will use it as you cannot rely on it when writing
> portable programs or it will be hell slow.

I agree 100%.  If the standards people can't write a decent standard, they ought not write it.  I torpedoed someone refining volatile, which would have been nice to have, because people were laying tracks down the wrong way.  Nuke em from orbit I say.  Now, I'm sure we have it all wrong and the standard is entirely reasonable...  right?

* Re: [cxx-mem-model] bitfield tests
  2011-03-30 14:25     ` Richard Guenther
  2011-03-30 14:41       ` Aldy Hernandez
@ 2011-03-31 14:58       ` Jeff Law
  2011-03-31 15:35         ` Richard Guenther
  1 sibling, 1 reply; 18+ messages in thread
From: Jeff Law @ 2011-03-31 14:58 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Aldy Hernandez, gcc-patches, Jakub Jelinek

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/30/11 08:19, Richard Guenther wrote:

> 
> Well, I'm not sure that strict-align targets that provide byte access do
> not simply hide the issue inside the CPU (thus, perform the read-modify-write
> there and do not guarantee any atomicity unless you ask for it).
Certainly some do this internally, but that's clearly out of our
control.  However, some really do sub-word accesses.

I even vaguely remember this being controllable by bits in page table
entries on one architecture.  You could set the bit which meant if I ask
for a byte access, then do it byte-wise, otherwise the processor would
do a read-modify-write.  Clearly this was meant to make it easier for
dealing with memory mapped devices.

Jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNlJQWAAoJEBRtltQi2kC7t0IIAJTpXGIyWcIpWqk26ofieuLc
T7PIBagNARbqEU2NwzgjeUyH4HMhCgwnAX8T4WXg2JJRXsZwxQPmKfk0x3mn6yBV
z60TISwtx53LEnqbLQG5FIU4QLyOcBOGuAFabyVcsT07tKE/wmGjDBkypbsBhUuw
ZFNEY7jausQGkaRy1ObxL4VWejk51XvcqNU2ReqjQJUvbS9UlpTNoopMixORG6Hb
qb4LF/Fr9S9cckB3oBxy4pZrdEd7/rlAroMoRXw2JwEbGNyfc9EACKtcXbopakCu
XnPxjsf4eVYNDl5jSf3r8w70fX5vqUimyfVeQqi49IcImqXGlfd/8US1ptOgZQE=
=WMAs
-----END PGP SIGNATURE-----

* Re: [cxx-mem-model] bitfield tests
  2011-03-31 14:58       ` Jeff Law
@ 2011-03-31 15:35         ` Richard Guenther
  2011-04-01 16:24           ` Richard Henderson
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Guenther @ 2011-03-31 15:35 UTC (permalink / raw)
  To: Jeff Law; +Cc: Aldy Hernandez, gcc-patches, Jakub Jelinek

On Thu, Mar 31, 2011 at 4:47 PM, Jeff Law <law@redhat.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 03/30/11 08:19, Richard Guenther wrote:
>
>>
>> Well, I'm not sure that strict-align targets that provide byte access do
>> not simply hide the issue inside the CPU (thus, perform the read-modify-write
>> there and do not guarantee any atomicity unless you ask for it).
> Certainly some do this internally, but that's clearly out of our
> control.

Sure.  My argument is that the memory model which guarantees
this kind of things for _any_ memory access is fundamentally flawed.
They should have simply required annotating objects which should
behave that way (and then only behave that way "per object", not
for any concurrent field accesses).

Richard.

> However, some really do sub-word accesses.
>
> I even vaguely remember this being controllable by bits in page table
> entries on one architecture.  You could set the bit which meant if I ask
> for a byte access, then do it byte-wise, otherwise the processor would
> do a read-modify-write.  Clearly this was meant to make it easier for
> dealing with memory mapped devices.
>
> Jeff
>

* Re: [cxx-mem-model] bitfield tests
  2011-03-31 15:35         ` Richard Guenther
@ 2011-04-01 16:24           ` Richard Henderson
  2011-04-01 19:42             ` Andrew Pinski
  2011-04-02  7:56             ` Richard Guenther
  0 siblings, 2 replies; 18+ messages in thread
From: Richard Henderson @ 2011-04-01 16:24 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jeff Law, Aldy Hernandez, gcc-patches, Jakub Jelinek

On 03/31/2011 08:28 AM, Richard Guenther wrote:
>>> Well, I'm not sure that strict-align targets that provide byte access do
>>> not simply hide the issue inside the CPU (thus, perform the read-modify-write
>>> there and do not guarantee any atomicity unless you ask for it).
>> Certainly some do this internally, but that's clearly out of our
>> control.
> 
> Sure.  My argument is that the memory model which guarantees
> this kind of things for _any_ memory access is fundamentally flawed.
> They should have simply required annotating objects which should
> behave that way (and then only behave that way "per object", not
> for any concurrent field accesses).

(0) Let's limit our discussion to cpus that are actually put into SMP systems,
    and have been manufactured in the last decade.

(1) Do we agree that all such cpus have user-level store insns with byte
    granularity?  Honestly the only non-microcontroller I ever heard of 
    without this was the original Alpha, which is excluded per (0).

(2) Do we agree that all such cpus have on-chip caches?

(3) Let us at this point limit our discussion to cacheable, i.e. non-I/O,
    memory.  I believe we can agree that all sorts of system-dependent stuff
    happens in memory-mapped registers.

(4) Do we agree that all such cpus transfer entire cachelines to and fro
    the memory bus?  And further that they simultaneously transfer a 
    modification mask as part of their cache coherency protocol?

(5) Do we agree that all such cpus use a byte-granular modification mask?

I'm guessing that you don't actually agree on point (5), but ... honestly,
please name the offender because I can't think of one.  For the mainstream
processors we really care about, I think every one of them Does The Right Thing.
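
A toy model of points (4) and (5), purely to make the claim concrete:
each CPU writes one byte into its own copy of a line, and write-back
merges only the bytes marked dirty.  The 4-byte line, the Line type and
the function names are assumptions for illustration; real coherence
protocols are considerably more involved.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLineBytes = 4;   // toy cache line size

struct Line {
    std::uint8_t data[kLineBytes];
    bool dirty[kLineBytes];             // byte-granular modification mask
};

// Write one byte into a private copy of the line and mark it dirty.
constexpr Line write_byte(Line line, std::size_t i, std::uint8_t v)
{
    line.data[i] = v;
    line.dirty[i] = true;
    return line;
}

// Merge a written-back line into memory, copying only dirty bytes.
constexpr Line write_back(Line memory, const Line& update)
{
    for (std::size_t i = 0; i < kLineBytes; ++i)
        if (update.dirty[i]) memory.data[i] = update.data[i];
    return memory;
}

// CPU0 stores 66 to byte 0, CPU1 stores 77 to byte 1, each starting
// from the same initial line; both copies are then written back.
constexpr Line two_writers()
{
    const Line memory{};
    const Line cpu0 = write_byte(memory, 0, 66);
    const Line cpu1 = write_byte(memory, 1, 77);  // its byte 0 is stale
    return write_back(write_back(memory, cpu0), cpu1);
}

constexpr std::uint8_t byte_after_merge(std::size_t i)
{
    const Line r = two_writers();
    return r.data[i];
}
```

In this model both stores survive even though CPU1's copy holds a stale
byte 0, because only its dirty byte is merged.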



r~

* Re: [cxx-mem-model] bitfield tests
  2011-04-01 16:24           ` Richard Henderson
@ 2011-04-01 19:42             ` Andrew Pinski
  2011-04-02  7:56             ` Richard Guenther
  1 sibling, 0 replies; 18+ messages in thread
From: Andrew Pinski @ 2011-04-01 19:42 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Richard Guenther, Jeff Law, Aldy Hernandez, gcc-patches, Jakub Jelinek

On Fri, Apr 1, 2011 at 9:24 AM, Richard Henderson <rth@redhat.com> wrote:
> (1) Do we agree that all such cpus have user-level store insns with byte
>    granularity.  Honestly the only non-microcontroler I ever heard of
>    without this was the original Alpha.  Which is excluded per (0).

And the SPU, which is excluded per (0) because it is not SMP but rather
AMP: it does not share memory.

-- Pinski

* Re: [cxx-mem-model] bitfield tests
  2011-04-01 16:24           ` Richard Henderson
  2011-04-01 19:42             ` Andrew Pinski
@ 2011-04-02  7:56             ` Richard Guenther
  2011-04-04 12:56               ` Aldy Hernandez
                                 ` (2 more replies)
  1 sibling, 3 replies; 18+ messages in thread
From: Richard Guenther @ 2011-04-02  7:56 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Jeff Law, Aldy Hernandez, gcc-patches, Jakub Jelinek

On Fri, Apr 1, 2011 at 6:24 PM, Richard Henderson <rth@redhat.com> wrote:
> On 03/31/2011 08:28 AM, Richard Guenther wrote:
>>>> Well, I'm not sure that strict-align targets that provide byte access do
>>>> not simply hide the issue inside the CPU (thus, perform the read-modify-write
>>>> there and do not guarantee any atomicity unless you ask for it).
>>> Certainly some do this internally, but that's clearly out of our
>>> control.
>>
>> Sure.  My argument is that the memory model which guarantees
>> this kind of things for _any_ memory access is fundamentally flawed.
>> They should have simply required annotating objects which should
>> behave that way (and then only behave that way "per object", not
>> for any concurrent field accesses).
>
> (0) Let's limit our discussion to cpus that are actually put into SMP systems,
>    and have been manufactured in the last decade.
>
> (1) Do we agree that all such cpus have user-level store insns with byte
>    granularity.  Honestly the only non-microcontroler I ever heard of
>    without this was the original Alpha.  Which is excluded per (0).
>
> (2) Do we agree that all such cpus have on-chip caches?
>
> (3) Let us at this point limit our discussion to cacheable, i.e. non-I/O,
>    memory.  I believe we can agree that all sorts of system-dependent stuff
>    happens in memory-mapped registers.
>
> (4) Do we agree that all such cpus transfer entire cachelines to and fro
>    the memory bus?  And further that they simultaneously transfer a
>    modification mask as part of their cache coherency protocol?
>
> (5) Do we agree that all such cpus use a byte-granular modification mask?
>
> I'm guessing that you don't actually agree on point (5), but ... honestly,
> please name the offender because I can't think of one.  For the mainstream
> processors we really care about, I think every one of them Does The Right Thing.

Yes, we don't agree on (5).  I can't name a CPU; I was just guessing
that strict-alignment CPUs would have such a requirement in order to also
make their store queues simpler (no need for such a mask).

Now, as for (0), I might agree to disregard the original Alpha, but as the
embedded world moves to SMP I'm not sure we can disregard
non-cache-coherent NUMA setups or even CPUs without a byte store.

But well, I guess the thing I don't like about the standard is that it makes
people that have started to be somewhat aware of threading issues
_less_ aware of them by providing some "false" safety to them.  It
really smells like a standard designed for a very high-level language
where people don't have to think instead of a standard suitable for a
C family language.

Richard.

>
>
> r~
>

* Re: [cxx-mem-model] bitfield tests
  2011-04-02  7:56             ` Richard Guenther
@ 2011-04-04 12:56               ` Aldy Hernandez
  2011-04-04 12:58               ` Aldy Hernandez
  2011-04-04 18:06               ` Jeff Law
  2 siblings, 0 replies; 18+ messages in thread
From: Aldy Hernandez @ 2011-04-04 12:56 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Richard Henderson, Jeff Law, gcc-patches, Jakub Jelinek


> But well, I guess the thing I don't like about the standard is that it makes
> people who have started to become somewhat aware of threading issues
> _less_ aware of them by giving them some "false" safety.  It
> really smells like a standard designed for a very high-level language
> where people don't have to think, rather than a standard suitable for a
> C family language.

Well, that's not exactly true.  You still need to think about threading.
All the standard is doing is guaranteeing that if you already have a
data-race-free program, the compiler won't introduce additional races
that weren't already there.
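
To make that concrete, here is a minimal sketch. The names `Flags` and
`run_two_writers` are hypothetical and made up for illustration; the layout
mirrors the non-contiguous bitfield case from the start of the thread.
Two threads write <a> and <c>, which are distinct memory locations, so the
program is data race free as written and the compiler is not supposed to
turn either store into a wider read-modify-write that also rewrites its
neighbours.

```cpp
#include <thread>

// Hypothetical layout for illustration: <a>, <b> and <c> are three distinct
// memory locations, because the non-bitfield member <b> separates <a> from <c>.
struct Flags {
    unsigned int a : 4;
    char b;
    unsigned int c : 6;
};

// Two threads write different memory locations; the only synchronization is
// the pair of join() calls. Under the C++11 memory model this has no data
// race, so both stores must be visible after the joins.
Flags run_two_writers() {
    Flags f = {0, 0, 0};
    std::thread t1([&f] { f.a = 5; });   // touches only <a>
    std::thread t2([&f] { f.c = 33; });  // touches only <c>
    t1.join();
    t2.join();
    return f;
}
```

If the compiler implemented the store to <a> as a word-sized load, mask, and
store covering <c> as well, the write from the other thread could be lost,
which is exactly the kind of compiler-introduced race the standard rules out.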

But I'm not a C++ guy.  I am no advocate for the standard.  I'm just 
implementing stuff.  Ahem, I'm just a soldier in this war :).

Aldy

* Re: [cxx-mem-model] bitfield tests
  2011-04-02  7:56             ` Richard Guenther
  2011-04-04 12:56               ` Aldy Hernandez
@ 2011-04-04 12:58               ` Aldy Hernandez
  2011-04-06 15:29                 ` Michael Matz
  2011-04-04 18:06               ` Jeff Law
  2 siblings, 1 reply; 18+ messages in thread
From: Aldy Hernandez @ 2011-04-04 12:58 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Richard Henderson, Jeff Law, gcc-patches, Jakub Jelinek


>> (5) Do we agree that all such cpus use a byte-granular modification mask?

> Now, as for (0), I might agree to disregard the original Alpha, but as the
> embedded world moves to SMP I'm not sure we can disregard
> non-cache-coherent NUMA setups or even CPUs without a byte store.

As per 5, it doesn't matter if the CPU lacks a byte store, since the 
cache has a byte-granular modification mask.

* Re: [cxx-mem-model] bitfield tests
  2011-04-02  7:56             ` Richard Guenther
  2011-04-04 12:56               ` Aldy Hernandez
  2011-04-04 12:58               ` Aldy Hernandez
@ 2011-04-04 18:06               ` Jeff Law
  2 siblings, 0 replies; 18+ messages in thread
From: Jeff Law @ 2011-04-04 18:06 UTC (permalink / raw)
  To: Richard Guenther
  Cc: Richard Henderson, Aldy Hernandez, gcc-patches, Jakub Jelinek

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/02/11 01:56, Richard Guenther wrote:

> But well, I guess the thing I don't like about the standard is that it makes
> people who have started to become somewhat aware of threading issues
> _less_ aware of them by giving them some "false" safety.  It
> really smells like a standard designed for a very high-level language
> where people don't have to think, rather than a standard suitable for a
> C family language.
I agree it's unfortunate, but there's a general trend of finding ways to
get more out of less experienced programmers.  One of the ways to do
that is to simplify the problem space these guys have to look at.

For better or worse, it's a trend I see continuing indefinitely.
Obviously we're starting to get off-topic.


jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNmgiBAAoJEBRtltQi2kC7OWIH/3pLUy3CpZ/tONDfonXuJOl8
aEotqjL6nmgyweg9poJlYy9MA0kNmCq25oj+TDE1H7w2kDVMAEeJtSxo37VPYS4+
KJtxD6l+J4KNhUbsSxE1oanI1f62Mf/1TZKziKW1AkDI7Ziszz5wwvD6jTU7QiJn
XaLm4gHvYtiwVBC5gPjVm0pqh8UZYpEiAdba9Y9WBSHUriLD0DfBcIwDbU59dlz0
1coYKJiXH5NlKUngFfR+oyO3pvGTgtJKweBcaQQCuV97nLsaOKiMRvVMQDA34afO
etua7nfBxM0JAeWu9ttNEjskFZi+ZG3oe8xtmj3IY5OhY1bzI0ARrbtu26K0/Ts=
=mOsV
-----END PGP SIGNATURE-----

* Re: [cxx-mem-model] bitfield tests
  2011-04-04 12:58               ` Aldy Hernandez
@ 2011-04-06 15:29                 ` Michael Matz
  2011-04-06 17:16                   ` Aldy Hernandez
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Matz @ 2011-04-06 15:29 UTC (permalink / raw)
  To: Aldy Hernandez
  Cc: Richard Guenther, Richard Henderson, Jeff Law, gcc-patches,
	Jakub Jelinek

Hi,

On Mon, 4 Apr 2011, Aldy Hernandez wrote:

> 
> > > (5) Do we agree that all such cpus use a byte-granular modification mask?
> 
> > Now, as for (0), I might agree to disregard the original Alpha, but as the
> > embedded world moves to SMP I'm not sure we can disregard
> > non-cache-coherent NUMA setups or even CPUs without a byte store.
> 
> As per 5, it doesn't matter if the CPU lacks a byte store, since the 
> cache has a byte-granular modification mask.

If it doesn't have byte stores there's no need for byte-granular 
modification masks :)


Ciao,
Michael.

* Re: [cxx-mem-model] bitfield tests
  2011-04-06 15:29                 ` Michael Matz
@ 2011-04-06 17:16                   ` Aldy Hernandez
  0 siblings, 0 replies; 18+ messages in thread
From: Aldy Hernandez @ 2011-04-06 17:16 UTC (permalink / raw)
  To: Michael Matz
  Cc: Richard Guenther, Richard Henderson, Jeff Law, gcc-patches,
	Jakub Jelinek

On 04/06/11 10:29, Michael Matz wrote:
> Hi,
>
> On Mon, 4 Apr 2011, Aldy Hernandez wrote:
>
>>
>>>> (5) Do we agree that all such cpus use a byte-granular modification mask?
>>
>>> Now, as for (0), I might agree to disregard the original Alpha, but as the
>>> embedded world moves to SMP I'm not sure we can disregard
>>> non-cache-coherent NUMA setups or even CPUs without a byte store.
>>
>> As per 5, it doesn't matter if the CPU lacks a byte store, since the
>> cache has a byte-granular modification mask.
>
> If it doesn't have byte stores there's no need for byte-granular
> modification masks :)

I was talking about a CPU whose byte store is implemented in microcode
as a wider operation plus logical operations that may touch adjacent
fields.  If adjacent bytes were touched, the cache would be updated
accordingly, hence the byte-granular modification mask.  That's my
understanding, anyhow.
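
A minimal sketch of what such a widened byte store looks like, written out
in C++ rather than microcode. The helper `store_byte_via_word` and the 32-bit
word size are assumptions for illustration only; this does not describe any
particular CPU or anything GCC emits.

```cpp
#include <cstdint>

// Hypothetical helper: store one byte into a 32-bit word using only a
// word-sized load, logical operations, and a word-sized store.
// byte_index 0 selects the least significant byte.
void store_byte_via_word(std::uint32_t &word, unsigned byte_index,
                         std::uint8_t value) {
    const unsigned shift = byte_index * 8;
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;
    std::uint32_t tmp = word;                               // word-sized load
    tmp = (tmp & ~mask) | (std::uint32_t{value} << shift);  // clear, then merge
    word = tmp;  // word-sized store: writes back all four bytes, not just one
}
```

For example, starting from 0x11223344, storing 0xAB at byte index 1 gives
0x1122AB44.  The other three bytes come back unchanged, but they are still
rewritten with the values read by the load.  If another thread stored to one
of them between the load and the store, that update would be lost unless
something below the instruction level tracks which bytes were really
modified.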


Thread overview: 18+ messages
2011-03-29 17:28 [cxx-mem-model] bitfield tests Aldy Hernandez
2011-03-30 10:50 ` Richard Guenther
2011-03-30 14:24   ` Aldy Hernandez
2011-03-30 14:25     ` Richard Guenther
2011-03-30 14:41       ` Aldy Hernandez
2011-03-30 14:43         ` Richard Guenther
2011-03-30 15:13           ` Mike Stump
2011-03-31 14:58       ` Jeff Law
2011-03-31 15:35         ` Richard Guenther
2011-04-01 16:24           ` Richard Henderson
2011-04-01 19:42             ` Andrew Pinski
2011-04-02  7:56             ` Richard Guenther
2011-04-04 12:56               ` Aldy Hernandez
2011-04-04 12:58               ` Aldy Hernandez
2011-04-06 15:29                 ` Michael Matz
2011-04-06 17:16                   ` Aldy Hernandez
2011-04-04 18:06               ` Jeff Law
2011-03-30 14:39     ` Michael Matz