From: Florian Weimer <fweimer@redhat.com>
To: DJ Delorie <dj@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: Update mmap() flags and errors lists
Date: Wed, 05 Jun 2024 00:16:17 +0200
Message-ID: <87tti8qrpq.fsf@oldenburg.str.redhat.com>
In-Reply-To: <xnzfsxze5q.fsf@greed.delorie.com> (DJ Delorie's message of "Fri, 10 May 2024 14:59:29 -0400")
* DJ Delorie:
> [DJ - information taken from various sources, including man pages
> (which I read, summarized in my notes, ignored for a while, then
> rewrote from my notes and kernel sources - "how to take advantage of
> bad memory" ;) and kernel sources (linux and hurd). I contemplated
> adding a table cross-referencing each flag with the kernels that
> support them and versions introduced, but decided that was too much
> work and detail for the results desired.]
>
> [patch starts here]
>
> Extend the list of MAP_* macros to include all macros available
> to the average program (gcc -E -dM | grep MAP_*)
>
> Extend the list of errno codes.
>
> diff --git a/manual/llio.texi b/manual/llio.texi
> index fae49d1433..2086e04afd 100644
> --- a/manual/llio.texi
> +++ b/manual/llio.texi
> @@ -1573,10 +1573,15 @@ permitted. They include @code{PROT_READ}, @code{PROT_WRITE}, and
> of address space for future use. The @code{mprotect} function can be
> used to change the protection flags. @xref{Memory Protection}.
>
> -@var{flags} contains flags that control the nature of the map.
> -One of @code{MAP_SHARED} or @code{MAP_PRIVATE} must be specified.
> +@var{flags} contains flags that control the nature of the map. One of
> +@code{MAP_SHARED}, @code{MAP_SHARED_VALIDATE}, or @code{MAP_PRIVATE}
> +must be specified. Additional flags may be bitwise OR'd to further
> +define the mapping.
While you are adding this, please avoid starting a sentence with @var,
so something like:
[The] @var{flags} [parameter] contains …
> -They include:
> +Note that, aside from @code{MAP_PRIVATE} and @code{MAP_SHARED}, not
> +all flags are supported on all versions of all operating systems.
> +Consult the kernel-specific documenation for details. The flags
> +include:
typo: documen[t]ation
> +@item MAP_SHARED_VALIDATE
> +Similar to @code{MAP_SHARED} except that additional flags will be
> +validated by the kernel, and the call will fail if an unrecognized
> +flag is provided. With @code{MAP_SHARED} using a flag on a kernel
> +that doesn't support it causes the flag to be ignored.
> +@code{MAP_SHARED_VALIDATE} should be used when the behavior of all
> +flags is required.
This raises the question of what to do if you want this checking
behavior with MAP_PRIVATE instead of MAP_SHARED.
> +
> @item MAP_FIXED
> This forces the system to use the exact mapping address specified in
> -@var{address} and fail if it can't.
> +@var{address} and fail if it can't. Note that if the new mapping
> +would overlap an existing mapping, the existing map is unmapped.
This is misleading, I believe. Only the overlapping part is replaced
with the new mapping. If the overlap is partial, the rest of the
previous mapping remains in place.
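A minimal sketch of the partial-overlap case, assuming Linux and
anonymous private mappings. The function name
map_fixed_partial_overlap is made up for illustration and is not part
of the patch or of glibc:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map two pages, then place a MAP_FIXED mapping over the second page
   only.  Returns 1 if the first page keeps its old contents while the
   second page reads as fresh zero-filled memory, 0 if that is not
   observed, and -1 if a mmap call fails.  */
int
map_fixed_partial_overlap (void)
{
  long pagesize = sysconf (_SC_PAGESIZE);
  assert (pagesize > 0);
  size_t ps = (size_t) pagesize;

  /* Original two-page anonymous mapping, filled with 0xAA.  */
  unsigned char *old = mmap (NULL, 2 * ps, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (old == MAP_FAILED)
    return -1;
  memset (old, 0xAA, 2 * ps);

  /* New one-page mapping placed exactly over the second page.  */
  unsigned char *repl = mmap (old + ps, ps, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                              -1, 0);
  if (repl == MAP_FAILED)
    {
      munmap (old, 2 * ps);
      return -1;
    }

  /* First page: untouched.  Second page: replaced by zeroed memory.  */
  int ok = (repl == old + ps
            && old[0] == 0xAA && old[ps - 1] == 0xAA
            && repl[0] == 0 && repl[ps - 1] == 0);

  munmap (old, 2 * ps);
  return ok;
}
```

Calling map_fixed_partial_overlap () from a test program is expected to
return 1 if the description above is right: the non-overlapping page of
the old mapping is still there.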
> +@item MAP_HUGE_16KB
> +@dots{}
> +@item MAP_HUGE_16GB
> +Some architectures support more than one size of ``huge'' pages for
> +@code{MAP_HUGETLB}. These flags allow the caller to choose amongst
> +them. Note that while the ABI allows the caller to specify arbitrary
> +page sizes, not all sizes have corresponding defined macros, and not
> +all defined macros correspond to sizes supported by the kernel. It is
> +up to the programmer to only ask for huge page sizes that are known to
> +be supported.
These we do not support? (We probably should.)
> +@item MAP_32BIT
> +Require addresses that can be accessed with a 32 bit pointer, i.e.,
> +within the first 4 GiB. Ignored if MAP_FIXED is specified.
> +
> +@item MAP_DENYWRITE
> +@item MAP_EXECUTABLE
> +@item MAP_FILE
> +
> +Provided for compatibility. Ignored by the Linux kernel.
I thought that some corner cases still handle MAP_DENYWRITE?
> +@item MAP_FIXED_NOREPLACE
> +Similar to @code{MAP_FIXED} except the call will fail with
> +@code{EEXIST} if the new mapping would overwrite an existing mapping.
How does this interact with MAP_SHARED_VALIDATE above? Can it be
combined with MAP_FIXED?
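For reference, a small sketch of the documented EEXIST case, assuming
Linux and headers that define MAP_FIXED_NOREPLACE. The function name
noreplace_errno_on_overlap is hypothetical. It does not answer the
questions about MAP_SHARED_VALIDATE or about combining with MAP_FIXED.
It also allows for kernels that do not know the flag, where the address
is only treated as a hint:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Try to map one page on top of an existing one-page mapping using
   MAP_FIXED_NOREPLACE.  Returns the errno value of the failed attempt
   (EEXIST is what the patch documents), 0 if the call succeeded
   (for example, on a kernel that ignores the flag and picks another
   address), or -1 if the existing mapping was disturbed or the flag
   is not defined by the headers.  */
int
noreplace_errno_on_overlap (void)
{
#ifdef MAP_FIXED_NOREPLACE
  long pagesize = sysconf (_SC_PAGESIZE);
  assert (pagesize > 0);
  size_t ps = (size_t) pagesize;

  unsigned char *old = mmap (NULL, ps, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  assert (old != MAP_FAILED);
  old[0] = 0x55;

  errno = 0;
  void *p = mmap (old, ps, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
                  -1, 0);
  int err = (p == MAP_FAILED) ? errno : 0;

  /* A kernel that ignores the flag may have mapped somewhere else.  */
  if (p != MAP_FAILED && p != (void *) old)
    munmap (p, ps);

  /* Either way, the existing mapping must still hold its data.  */
  int intact = old[0] == 0x55;
  munmap (old, ps);
  return intact ? err : -1;
#else
  return -1;
#endif
}
```

On a kernel that implements the flag, the return value should be
EEXIST. A return value of 0 suggests the flag was ignored and the
requested address was used only as a hint.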
> +@item MAP_GROWSDOWN
> +This flag is used to make stacks, and is typically only needed inside
> +the program loader to set up the main stack and thread stacks for the
> +running process. The mapping is created according to the other flags,
> +except an additional page just prior to the mapping is marked as a
> +``guard page''. If a write is attempted inside this guard page, that
> +page is mapped, the mapping is extended, and a new guard page is
> +created. Thus, the mapping continues to grow towards lower addresses
> +until it encounters some other mapping.
Maybe reference -fstack-clash-protection, and note that @theglibc{} does
not use this for thread stacks?
> +@item MAP_LOCKED
> +Requests that mapped pages are locked in memory (i.e. not paged out).
> +Note that this is a request and not a requirement; use @code{mlock} if
> +locking is mandatory.
> +
> +@item MAP_POPULATE
> +@item MAP_NONBLOCK
> +These two are opposites. @code{MAP_POPULATE} requests that the kernel
> +read-ahead a file-backed mapping, causing more pages to be mapped
> +before they're needed. @code{MAP_NONBLOCK} requests that the kernel
> +@emph{not} attempt such, only mapping pages when they're actually
> +needed.
MAP_POPULATE is just a hint, right? And even with mlockall, or
MAP_LOCKED, it does not guarantee the absence of future page faults.
> +@item MAP_NORESERVE
> +Asks the kernel to not reserve physical backing for a mapping. This
> +would be useful for, for example, a very large but sparsely used
> +mapping which need not be limited in span by available RAM or swap.
> +Note that writes to such a mapping may cause a @code{SIGSEGV} if the
> +amount of backing required eventualy exceeds system resources.
> +
> +On Linux, this flag's behavior may be overwridden by
> +@code{/proc/sys/vm/overcommit_memory} as documented in swap(5).
Should @xref the man-pages section added in the other patch. However,
swap(5) does not appear to exist?
> +@item MAP_STACK
> +Ensures that the resulting mapping is suitable for use as a program
> +stack. For example, the use of huge pages might be precluded.
> +
> +@item MAP_SYNC
> +This flag is used to map persistent memory devices into the running
> +program in such a way that writes to the mapping are immediately
> +written to the device as well. Unlike most other flags, this one will
> +fail unless @code{MAP_SHARED_VALIDATE} is also given.
Is this about DAX?
> +@item MAP_UNINITIALIZED
> +This flag allows the kernel to map anonymous pages without zeroing
> +them out first. This is, of course, a security risk, and will only
> +work if the kernel is built to allow it (typically, on single-process
> +embedded systems).
>
> @end vtable
>
> @@ -1655,6 +1735,24 @@ Possible errors include:
>
> @table @code
>
> +@item EACCES
> +
> +@var{filedes} was not open for the type of access specified in @var{protect}.
> +
> +@item EAGAIN
> +
> +Either the underlying file is locked, or the system has temporarily
> +run out of resources.
See below; I think the reference to locking is spurious.
> +@item EBADF
> +
> +The @var{fd} passes is invalid, and a valid file descriptor is required.
Is a file descriptor ever required?
> +@item EEXIST
> +
> +@code{MAP_FIXED_NOREPLACE} was specified and an existing mapping was
> +found in the requested address range.
See my comment above for MAP_FIXED_NOREPLACE.
> @item EINVAL
>
> Either @var{address} was unusable (because it is not a multiple of the
> @@ -1663,28 +1761,35 @@ applicable page size), or inconsistent @var{flags} were given.
> If @code{MAP_HUGETLB} was specified, the file or system does not support
> large page sizes.
>
> -@item EACCES
> +@item ENFILE
>
> -@var{filedes} was not open for the type of access specified in @var{protect}.
> +There are too many open files in the system.
Can this error actually happen? It's a bit surprising.
> +@item ENODEV
> +
> +This file is of a type that doesn't support mapping.
>
> @item ENOMEM
>
> Either there is not enough memory for the operation, or the process is
> out of address space.
This should probably reference vm.max_map_count.
> -@c On Linux, EAGAIN will appear if the file has a conflicting mandatory lock.
> -@c However mandatory locks are not discussed in this manual.
Mandatory locks are disabled in pretty much all kernels out there, no?
> +@item EOVERFLOW
> +
> +Either the offset into the file causes the page counts to exceed the
> +range of a 32 bit number, or the offset requested exceeds the length
> +of the file.
The reference to page size may be incorrect. I think it's a fixed
offset regardless of page size on systems that can't pass a 64-bit file
offset.
> +@item ETXTBSY
> +
> +@code{MAP_DENYWRITE} was specified, but the file descriptor given was
> +open for writing.
This seems to contradict the earlier suggestion that MAP_DENYWRITE is
ignored.
Thanks,
Florian