From: Florian Weimer <fweimer@redhat.com>
To: DJ Delorie <dj@redhat.com>
Cc: libc-alpha@sourceware.org
Subject: Re: Update mmap() flags and errors lists
Date: Wed, 05 Jun 2024 00:16:17 +0200
Message-ID: <87tti8qrpq.fsf@oldenburg.str.redhat.com>
In-Reply-To: <xnzfsxze5q.fsf@greed.delorie.com> (DJ Delorie's message of "Fri, 10 May 2024 14:59:29 -0400")

* DJ Delorie:

> [DJ - information taken from various sources, including man pages
> (which I read, summarized in my notes, ignored for a while, then
> rewrote from my notes and kernel sources - "how to take advantage of
> bad memory" ;) and kernel sources (linux and hurd).  I contemplated
> adding a table cross-referencing each flag with the kernels that
> support them and versions introduced, but decided that was too much
> work and detail for the results desired.]
>
> [patch starts here]
>
> Extend the list of MAP_* macros to include all macros available
> to the average program (gcc -E -dM | grep MAP_*)
>
> Extend the list of errno codes.
>
> diff --git a/manual/llio.texi b/manual/llio.texi
> index fae49d1433..2086e04afd 100644
> --- a/manual/llio.texi
> +++ b/manual/llio.texi
> @@ -1573,10 +1573,15 @@ permitted.  They include @code{PROT_READ}, @code{PROT_WRITE}, and
>  of address space for future use.  The @code{mprotect} function can be
>  used to change the protection flags.  @xref{Memory Protection}.
>  
> -@var{flags} contains flags that control the nature of the map.
> -One of @code{MAP_SHARED} or @code{MAP_PRIVATE} must be specified.
> +@var{flags} contains flags that control the nature of the map.  One of
> +@code{MAP_SHARED}, @code{MAP_SHARED_VALIDATE}, or @code{MAP_PRIVATE}
> +must be specified.  Additional flags may be bitwise OR'd to further
> +define the mapping.

While you are adding this, please avoid starting a sentence with @var,
so something like:

  [The] @var{flags} [parameter] contains …

> -They include:
> +Note that, aside from @code{MAP_PRIVATE} and @code{MAP_SHARED}, not
> +all flags are supported on all versions of all operating systems.
> +Consult the kernel-specific documenation for details.  The flags
> +include:

typo: documen[t]ation

> +@item MAP_SHARED_VALIDATE
> +Similar to @code{MAP_SHARED} except that additional flags will be
> +validated by the kernel, and the call will fail if an unrecognized
> +flag is provided.  With @code{MAP_SHARED} using a flag on a kernel
> +that doesn't support it causes the flag to be ignored.
> +@code{MAP_SHARED_VALIDATE} should be used when the behavior of all
> +flags is required.

This raises the question of what to do if you want this checking
behavior with MAP_PRIVATE instead of MAP_SHARED.

> +
>  @item MAP_FIXED
>  This forces the system to use the exact mapping address specified in
> -@var{address} and fail if it can't.
> +@var{address} and fail if it can't.  Note that if the new mapping
> +would overlap an existing mapping, the existing map is unmapped.

This is misleading, I believe.  The overlapping part is replaced with
the new mapping.  If the overlap is incomplete, part of the previous
mapping remains.

> +@item MAP_HUGE_16KB
> +@dots{}
> +@item MAP_HUGE_16GB
> +Some architectures support more than one size of ``huge'' pages for
> +@code{MAP_HUGETLB}.  These flags allow the caller to choose amongst
> +them.  Note that while the ABI allows the caller to specify arbitrary
> +page sizes, not all sizes have corresponding defined macros, and not
> +all defined macros correspond to sizes supported by the kernel.  It is
> +up to the programmer to only ask for huge page sizes that are known to
> +be supported.

These we do not support?  (We probably should.)

> +@item MAP_32BIT
> +Require addresses that can be accessed with a 32 bit pointer, i.e.,
> +within the first 4 GiB.  Ignored if MAP_FIXED is specified.
> +
> +@item MAP_DENYWRITE
> +@item MAP_EXECUTABLE
> +@item MAP_FILE
> +
> +Provided for compatibility.  Ignored by the Linux kernel.

I thought that some corner cases still handle MAP_DENYWRITE?

> +@item MAP_FIXED_NOREPLACE
> +Similar to @code{MAP_FIXED} except the call will fail with
> +@code{EEXIST} if the new mapping would overwrite an existing mapping.

How does this interact with MAP_SHARED_VALIDATE above?  Can it be
combined with MAP_FIXED?

> +@item MAP_GROWSDOWN
> +This flag is used to make stacks, and is typically only needed inside
> +the program loader to set up the main stack and thread stacks for the
> +running process.  The mapping is created according to the other flags,
> +except an additional page just prior to the mapping is marked as a
> +``guard page''.  If a write is attempted inside this guard page, that
> +page is mapped, the mapping is extended, and a new guard page is
> +created.  Thus, the mapping continues to grow towards lower addresses
> +until it encounters some other mapping.

Maybe reference -fstack-clash-protection, and note that @theglibc{} does
not use this for thread stacks?

> +@item MAP_LOCKED
> +Requests that mapped pages are locked in memory (i.e. not paged out).
> +Note that this is a request and not a requirement; use @code{mlock} if
> +locking is mandatory.
> +
> +@item MAP_POPULATE
> +@item MAP_NONBLOCK
> +These two are opposites.  @code{MAP_POPULATE} requests that the kernel
> +read-ahead a file-backed mapping, causing more pages to be mapped
> +before they're needed.  @code{MAP_NONBLOCK} requests that the kernel
> +@emph{not} attempt such, only mapping pages when they're actually
> +needed.

MAP_POPULATE is just a hint, right?  And even with mlockall, or
MAP_LOCKED, it does not guarantee the absence of future page faults.

> +@item MAP_NORESERVE
> +Asks the kernel to not reserve physical backing for a mapping.  This
> +would be useful for, for example, a very large but sparsely used
> +mapping which need not be limited in span by available RAM or swap.
> +Note that writes to such a mapping may cause a @code{SIGSEGV} if the
> +amount of backing required eventualy exceeds system resources.
> +
> +On Linux, this flag's behavior may be overwridden by
> +@code{/proc/sys/vm/overcommit_memory} as documented in swap(5).

This should @xref the man-pages section added in the other patch.
However, swap(5) does not appear to exist?

> +@item MAP_STACK
> +Ensures that the resulting mapping is suitable for use as a program
> +stack.  For example, the use of huge pages might be precluded.
> +
> +@item MAP_SYNC
> +This flag is used to map persistent memory devices into the running
> +program in such a way that writes to the mapping are immediately
> +written to the device as well.  Unlike most other flags, this one will
> +fail unless @code{MAP_SHARED_VALIDATE} is also given.

Is this about DAX?

> +@item MAP_UNINITIALIZED
> +This flag allows the kernel to map anonymous pages without zeroing
> +them out first.  This is, of course, a security risk, and will only
> +work if the kernel is built to allow it (typically, on single-process
> +embedded systems).
>  
>  @end vtable
>  
> @@ -1655,6 +1735,24 @@ Possible errors include:
>  
>  @table @code
>  
> +@item EACCES
> +
> +@var{filedes} was not open for the type of access specified in @var{protect}.
> +
> +@item EAGAIN
> +
> +Either the underlying file is locked, or the system has temporarily
> +run out of resources.

See below; I think the reference to locking is spurious.

> +@item EBADF
> +
> +The @var{fd} passes is invalid, and a valid file descriptor is required.

Is a file descriptor ever required?

> +@item EEXIST
> +
> +@code{MAP_FIXED_NOREPLACE} was specified and an existing mapping was
> +found in the requested address range.

See my comment above for MAP_FIXED_NOREPLACE.

>  @item EINVAL
>  
>  Either @var{address} was unusable (because it is not a multiple of the
> @@ -1663,28 +1761,35 @@ applicable page size), or inconsistent @var{flags} were given.
>  If @code{MAP_HUGETLB} was specified, the file or system does not support
>  large page sizes.
>  
> -@item EACCES
> +@item ENFILE
>  
> -@var{filedes} was not open for the type of access specified in @var{protect}.
> +There are too many open files in the system.

Can this error actually happen?  It's a bit surprising.

> +@item ENODEV
> +
> +This file is of a type that doesn't support mapping.
>  
>  @item ENOMEM
>  
>  Either there is not enough memory for the operation, or the process is
>  out of address space.

This should probably reference vm.max_map_count.

> -@c On Linux, EAGAIN will appear if the file has a conflicting mandatory lock.
> -@c However mandatory locks are not discussed in this manual.

Mandatory locks are disabled in pretty much all kernels out there, no?

> +@item EOVERFLOW
> +
> +Either the offset into the file causes the page counts to exceed the
> +range of a 32 bit number, or the offset requested exceeds the length
> +of the file.

The reference to page size may be incorrect.  I think it's a fixed
offset regardless of page size on systems that can't pass a 64-bit file
offset.

> +@item ETXTBSY
> +
> +@code{MAP_DENYWRITE} was specified, but the file descriptor given was
> +open for writing.

This seems to contradict the earlier suggestion that MAP_DENYWRITE is
ignored.

Thanks,
Florian

