Re: UB status of snprintf on invalid ptr+size combination?

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

From: Alejandro Colomar <alx.manpages@gmail.com>
To: Vincent Lefevre <vincent@vinc17.net>, libc-alpha@sourceware.org
Cc: Stephan Bergmann <sbergman@redhat.com>,
	Andreas Schwab <schwab@linux-m68k.org>,
	Paul Eggert <eggert@cs.ucla.edu>,
	Simon Chopin <simon.chopin@canonical.com>
Subject: Re: UB status of snprintf on invalid ptr+size combination?
Date: Mon, 20 Mar 2023 17:15:49 +0100	[thread overview]
Message-ID: <ba3499ce-f296-5368-e9ed-f55526fb4b07@gmail.com> (raw)
In-Reply-To: <20230320150929.GA283644@cventin.lip.ens-lyon.fr>


[-- Attachment #1.1: Type: text/plain, Size: 6801 bytes --]

Hi Vincent,

On 3/20/23 14:36, Vincent Lefevre wrote:
> On 2023-03-20 13:17:11 +0100, Alejandro Colomar wrote:
>> On 3/20/23 13:05, Siddhesh Poyarekar wrote:
>>> On 2023-03-19 19:07, Vincent Lefevre wrote:
>>>> The function may not know the buffer size if `checked` is true,
>>>> so that it uses a known bound. Thanks to common code factorized,
>>>> this is more readable than
>>>>
>>>>    if (checked)
>>>>      sprintf (buf, "%s", s);
>>>>    else
>>>>      snprintf(buf, n, "%s", s);
>>>>
>>>> in particular in the cases where the format string is complex.
>>
>> That pattern looks like _FORTIFY_SOURCE, doesn't it?  If so, the correct
>> action would be to call sprintf(3) and rely on the compiler to do the
>> checks.
> 
> Not necessarily. Here, this is the programmer who would do the check,
> though he may add code (e.g. assertions) on the caller side to give
> enough information to the compiler so that a smart compiler could
> prove that the call is valid. But this may need an analysis across
> different compilation units. If the compiler cannot check, the
> validity relies on the proof of the code; but not that in any case,
> the code needs to be proved: for instance, snprintf() may yield
> data loss and potential security issues if string truncation is
> not expected (or if the string is truncation too early).
> 

$ cat str.c 
#include <stdio.h>

void f(char *dst, char *src)
{
	sprintf(dst, "%s plus some extra stuff", src);
}

$ cat main.c 
void f(char *dst, char *src);

int main(void)
{
	char *str = "some long string";
	char s[20];

	f(s, str);
}

$ cc -Wall -Wextra *.c -flto -O3 -fanalyzer -D_FORTIFY_SOURCE=1
$ ./a.out 
*** buffer overflow detected ***: terminated
Aborted
$ cc -Wall -Wextra *.c -flto -O3 -fanalyzer -D_FORTIFY_SOURCE=3
alx@asus5775:~/tmp/fort$ ./a.out 
*** buffer overflow detected ***: terminated
Aborted


Is this what you're looking for?  I agree that it would be nicer
if the analyzer could catch this at build time, but it seems it's
not yet so powerful.


On 3/20/23 14:50, Vincent Lefevre wrote:
> On 2023-03-20 08:05:32 -0400, Siddhesh Poyarekar wrote:
>> I think on the glibc front it makes sense from a security
>> perspective to interpret this through POSIX than the C standard.
>> Even if the C standard is clarified to be contrary to POSIX and
>> explicitly state that n is not the size of the buffer (which would
>> be a terrible mistake IMO), I'd lean towards violating the C
>> standard and conforming to POSIX instead.
> I disagree about the POSIX behavior (assuming it is intentional).
> With it, if the compiler detects that n is larger than the actual
> buffer size, then due to undefined behavior, the compiler could
> assume that this is dead code and introduce erratic behavior in
> code written with the C standard in mind (or when it was introduced
> in BSD).
> 
> Note that the printf(3) man page from FreeBSD 1.0 (1991), just says:
> 
>   Snprintf() and vsnprintf() will write at most size-1 of the
>   characters printed into the output string (the size'th character
>   then gets the terminating `\0'); if the return value is greater
>   than or equal to the size argument, the string was too short and
>   some of the printed characters were discarded.
The Linux man-pages show the following SYNOPSIS, which shows the
meaning of 'size' as being an array size:


SYNOPSIS
       #include <stdio.h>

       int printf(const char *restrict format, ...);
       int fprintf(FILE *restrict stream,
                   const char *restrict format, ...);
       int dprintf(int fd,
                   const char *restrict format, ...);
       int sprintf(char *restrict str,
                   const char *restrict format, ...);
       int snprintf(char str[restrict .size], size_t size,
                   const char *restrict format, ...);

       int vprintf(const char *restrict format, va_list ap);
       int vfprintf(FILE *restrict stream,
                   const char *restrict format, va_list ap);
       int vdprintf(int fd,
                   const char *restrict format, va_list ap);
       int vsprintf(char *restrict str,
                   const char *restrict format, va_list ap);
       int vsnprintf(char str[restrict .size], size_t size,
                   const char *restrict format, va_list ap);


I'll need to update the text, though, to be more explicit in
that it's an array size, and not just a limit for the copying.


> 
> Reference:
> https://man.freebsd.org/cgi/man.cgi?query=snprintf&apropos=0&sektion=0&manpath=FreeBSD+1.0-RELEASE&arch=default&format=html
> 
> -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


On 3/20/23 16:09, Vincent Lefevre wrote:
> On 2023-03-19 10:45:59 -0400, manfred via Libc-alpha wrote:
>> All of that said, back to the OP case I would not pass INT_MAX to snprintf.
>> If I have a situation wherein I know that the buffer is large enough, but I
>> don't know its exact size, I'd use sprintf and be done with it. (I'm sure
>> that the actual code is more elaborate than this, but still)
> 
> Here's another example where the support of snprintf with a large n
> argument (larger than the buffer size) may be used:
> 
> In GNU MPFR, for our function mpfr_snprintf, we have assumed a
> behavior analogue to the ISO C behavior. In particular, we use
> that in our tests in order to check that large values of n are
> correctly handled. This allowed us to trigger/detect a bug in
> the choice of the integer types in our implementation:
> 
> The test:
> 
> https://gitlab.inria.fr/mpfr/mpfr/-/commit/67a75bfe41d3a7f95367ee9e62bd7dfc73e5b395
> 
> The bug fix (replacing an int by a size_t in a variable declaration):
> 
> https://gitlab.inria.fr/mpfr/mpfr/-/commit/6b8cf3e2bdc285027627281cac230ed932c1b73f

I don't understand how snprintf(3) helped catch the bug.  Could
you show some small reproducer?  Would _FORTIFY_SOURCE not do a
similar thing?

Cheers,

Alex

> 
> Note that the MPFR code currently does not use snprintf or any
> similar function, so that there should be no issues with this
> test if snprintf does not support n > buffer size, it but may
> use such a function in the future. (Indeed, MPFR currently uses
> gmp_vasprintf and gmp_asprintf, thus generating a full output
> before truncating it as needed, so that this is potentially
> very inefficient and could unnecessarily exhaust the memory
> and/or crash.)
> 

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

next prev parent reply	other threads:[~2023-03-20 16:16 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-14 19:47 Simon Chopin
2023-03-14 21:39 ` Paul Eggert
2023-03-15  9:22   ` Andreas Schwab
2023-03-15 15:54     ` Siddhesh Poyarekar
2023-03-15 18:34     ` Michael Hudson-Doyle
2023-03-19 14:45     ` manfred
2023-03-19 23:07       ` Vincent Lefevre
2023-03-20 12:05         ` Siddhesh Poyarekar
2023-03-20 12:17           ` Alejandro Colomar
2023-03-20 12:29             ` Siddhesh Poyarekar
2023-03-20 13:36             ` Vincent Lefevre
2023-03-20 13:50           ` Vincent Lefevre
2023-03-20 16:56             ` Siddhesh Poyarekar
2023-03-20 17:36               ` Vincent Lefevre
2023-03-20 15:09       ` Vincent Lefevre
2023-03-20 16:15         ` Alejandro Colomar [this message]
2023-03-20 16:33           ` Vincent Lefevre
2023-03-20 17:00           ` Vincent Lefevre
2023-03-20 17:31             ` Siddhesh Poyarekar
2023-03-20 17:45               ` Vincent Lefevre
2023-03-15 12:39   ` Vincent Lefevre
2023-03-16 10:29     ` Stephan Bergmann
2023-03-18  2:07       ` Vincent Lefevre
2023-03-18  2:30         ` Alejandro Colomar
2023-03-18 10:58           ` Vincent Lefevre
2023-03-18 15:01             ` Andreas Schwab
2023-03-19 22:48               ` Vincent Lefevre
2023-03-19 23:24                 ` Andreas Schwab
2023-03-20  4:10                   ` Vincent Lefevre
2023-03-20  9:19                     ` Andreas Schwab
2023-03-20 10:42                       ` Vincent Lefevre
2023-03-20 10:44                         ` Andreas Schwab

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ba3499ce-f296-5368-e9ed-f55526fb4b07@gmail.com \
    --to=alx.manpages@gmail.com \
    --cc=eggert@cs.ucla.edu \
    --cc=libc-alpha@sourceware.org \
    --cc=sbergman@redhat.com \
    --cc=schwab@linux-m68k.org \
    --cc=simon.chopin@canonical.com \
    --cc=vincent@vinc17.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).