public inbox for gcc-help@gcc.gnu.org
* Reading POD data directly from mmapped memory in C++
@ 2021-03-01 22:21 Thomas Bleher
  2021-03-01 23:25 ` Marc Glisse
  0 siblings, 1 reply; 2+ messages in thread
From: Thomas Bleher @ 2021-03-01 22:21 UTC
  To: gcc-help

Hello,

I'm wondering if G++ has special guarantees for accessing mmapped
memory.

My use-case:
- I have some large data sets (>10GB) in a file, mostly float arrays,
  plus some POD management structs
- I need to share this data read-only between unrelated processes

To support this, I'd like to mmap the file containing the data into the
processes that access it (properly aligned of course), and read it
directly. I cannot memcpy the whole content, since then the data sharing
between processes would be lost.

However, my understanding is that C++ basically only allows memcpy from
such data, but no direct access e.g. as float array (which would be UB
in C++, because of the object lifetime rules).
See
https://stackoverflow.com/questions/55034863/dealing-with-undefined-behavior-when-using-reinterpret-cast-in-a-memory-mapping
for a very similar question from someone else.

My question: does g++ guarantee anything beyond ISO C++ in this regard?
Using mmap to share data between processes sounds very useful, so it
would be a pity if this was impossible.

Alternatively, I could write the direct data access in C, and call that
from C++. Basically, have a C function that mmaps the file, finds the
start of the float array in the file, and returns a 'const float*'
pointing to the start of the array. My understanding is that this is
legal in C. Would the returned pointer also be legal to use in C++?
(Assuming gcc and g++ as compiler; I'm asking about the guarantees GCC
gives beyond ISO C and ISO C++).

Thanks,
Thomas


* Re: Reading POD data directly from mmapped memory in C++
  2021-03-01 22:21 Reading POD data directly from mmapped memory in C++ Thomas Bleher
@ 2021-03-01 23:25 ` Marc Glisse
  0 siblings, 0 replies; 2+ messages in thread
From: Marc Glisse @ 2021-03-01 23:25 UTC
  To: Thomas Bleher; +Cc: gcc-help

On Mon, 1 Mar 2021, Thomas Bleher via Gcc-help wrote:

> I'm wondering if G++ has special guarantees for accessing mmapped
> memory.
>
> My use-case:
> - I have some large data sets (>10GB) in a file, mostly float arrays,
>  plus some POD management structs
> - I need to share this data read-only between unrelated processes
>
> To support this, I'd like to mmap the file containing the data into the
> processes that access it (properly aligned of course), and read it
> directly. I cannot memcpy the whole content, since then the data sharing
> between processes would be lost.
>
> However, my understanding is that C++ basically only allows memcpy from
> such data, but no direct access e.g. as float array (which would be UB
> in C++, because of the object lifetime rules).

If you really wanted to, you could write a wrapper whose operator[] does a
memcpy of just the 4 bytes you want into a float and returns that float;
this would be optimized to a plain read.
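
A minimal sketch of such a wrapper, for illustration only. The class name
FloatView and its interface are hypothetical and not from the thread; it
assumes the mapped bytes hold floats in the host's native representation
and that the caller supplies the element count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical read-only view over raw bytes that hold packed floats.
// Each access copies one element out with memcpy, so no float object is
// ever assumed to live in the underlying storage.
class FloatView {
public:
    FloatView(const void* base, std::size_t count)
        : base_(static_cast<const unsigned char*>(base)), count_(count) {}

    std::size_t size() const { return count_; }

    // Copy sizeof(float) bytes into a local float and return it by value.
    float operator[](std::size_t i) const {
        assert(i < count_);  // debug-only bounds check
        float value;
        std::memcpy(&value, base_ + i * sizeof(float), sizeof(float));
        return value;
    }

private:
    const unsigned char* base_;  // start of the raw bytes (e.g. from mmap)
    std::size_t count_;          // number of floats available
};
```

Because operator[] returns by value, callers cannot take the address of an
element or hand the data to an API that expects a 'const float*'; that is
the price of this approach.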

> My question: does g++ guarantee anything beyond ISO C++ in this regard?
> Using mmap to share data between processes sounds very useful, so it
> would be a pity if this was impossible.

I don't know how official this is, but I believe you are safe. mmap is an
opaque function: for all gcc knows, it may have actually created objects
of the expected types in those locations, so gcc cannot make any
optimization that would break that case.
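
For illustration, the direct-access pattern under discussion might look
like the sketch below. It assumes a POSIX system and a file that contains
nothing but native-format floats; the names MappedFloats,
write_sample_file, map_float_file and unmap_float_file are hypothetical.
It relies on the reasoning above, not on a guarantee from ISO C++ or from
documented GCC behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical handle for a read-only mapping viewed as a float array.
struct MappedFloats {
    const float* data = nullptr;  // nullptr means "not mapped"
    std::size_t count = 0;        // number of whole floats in the file
    void* base = nullptr;         // address returned by mmap
    std::size_t bytes = 0;        // length passed to mmap
};

// Helper for trying the sketch out: writes n floats in native format.
inline bool write_sample_file(const char* path, const float* src,
                              std::size_t n) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t written = std::fwrite(src, sizeof(float), n, f);
    return std::fclose(f) == 0 && written == n;
}

// Maps the whole file read-only and shared; data stays nullptr on failure.
inline MappedFloats map_float_file(const char* path) {
    MappedFloats m;
    const int fd = open(path, O_RDONLY);
    if (fd < 0) return m;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return m;
    }
    const std::size_t bytes = static_cast<std::size_t>(st.st_size);
    void* p = mmap(nullptr, bytes, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (p == MAP_FAILED) return m;
    // mmap returns page-aligned memory, so this is only a sanity check.
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(float) == 0);
    m.base = p;
    m.bytes = bytes;
    m.data = static_cast<const float*>(p);  // the cast the thread is about
    m.count = bytes / sizeof(float);
    return m;
}

// Releases the mapping and resets the handle.
inline void unmap_float_file(MappedFloats& m) {
    if (m.base) munmap(m.base, m.bytes);
    m = MappedFloats{};
}
```

A real file with management structs in front of the float data would also
need an offset that keeps the array suitably aligned, which this sketch
does not handle.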

Most issues arise when you perform two conflicting operations on the same
memory location, which is not your case.

It might cause issues in some intrusive debugging mode that would track 
all objects, but I don't think I've seen a tool like that.

-- 
Marc Glisse
