public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Martin Sebor <msebor@gmail.com>
To: Alejandro Colomar <alx.manpages@gmail.com>,
	Andrew Pinski <pinskia@gmail.com>
Cc: gcc@gcc.gnu.org, linux-man <linux-man@vger.kernel.org>,
	GNU C Library <libc-alpha@sourceware.org>, groff <groff@gnu.org>,
	Martin Sebor <msebor@redhat.com>
Subject: Re: [-Wstringop-overflow=] strncat(3)
Date: Thu, 15 Dec 2022 13:50:00 -0700	[thread overview]
Message-ID: <746bce22-9b6f-9452-5f0e-0e257738a03d@gmail.com> (raw)
In-Reply-To: <c306794f-8bfd-6b88-0baa-352d4c8b6871@gmail.com>

On 12/14/22 16:14, Alejandro Colomar via Libc-alpha wrote:
> [CC += groff]
> 
> Hi Andrew,
> 
> On 12/14/22 23:57, Andrew Pinski wrote:
>> On Wed, Dec 14, 2022 at 2:46 PM Alejandro Colomar via Libc-alpha
>> <libc-alpha@sourceware.org> wrote:
>>>
>>> Hi,
>>>
>>> I was rewriting the strncat(3) manual page, and when I tried to 
>>> compile the
>>> example program, I got a surprise from the compiler.
>>>
>>> Here goes the page:
>>>
>>>
>>>     strncat(3)               Library Functions Manual              
>>> strncat(3)
>>>
>>>     NAME
>>>            strncat  -  concatenate  a  null‐padded  character 
>>> sequence into a
>>>            string
>>>
>>>     LIBRARY
>>>            Standard C library (libc, -lc)
>>>
>>>     SYNOPSIS
>>>            #include <string.h>
>>>
>>>            char *strncat(char *restrict dst, const char src[restrict 
>>> .sz],
>>>                           size_t sz);
>>>
>>>     DESCRIPTION
>>>            This function catenates the input character sequence 
>>> contained  in
>>>            a  null‐padded  fixed‐width  buffer,  into  a string at 
>>> the buffer
>>>            pointed to by dst.  The programmer is responsible for 
>>> allocating a
>>>            buffer large enough, that is, strlen(dst) + strnlen(src, 
>>> sz) + 1.
>>>
>>>            An implementation of this function might be:
>>>
>>>                char *
>>>                strncat(char *restrict dst, const char *restrict src, 
>>> size_t sz)
>>>                {
>>>                    int   len;
>>>                    char  *end;
>>>
>>>                    len = strnlen(src, sz);
>>>                    end = dst + strlen(dst);
>>>                    end = mempcpy(end, src, len);
>>>                    *end = '\0';
>>>
>>>                    return dst;
>>>                }
>>>
>>>     RETURN VALUE
>>>            strncat() returns dest.
>>>
>>>     ATTRIBUTES
>>>            [...]
>>>
>>>     STANDARDS
>>>            POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>>>
>>>     CAVEATS
>>>            The  name of this function is confusing.  This function 
>>> has no re‐
>>>            lation with strncpy(3).
>>>
>>>            If the destination buffer is not large enough, the 
>>> behavior is un‐
>>>            defined.  See _FORTIFY_SOURCE in feature_test_macros(7).
>>>
>>>     BUGS
>>>            This function  can  be  very  inefficient.   Read  about  
>>> Shlemiel
>>>            the       painter      
>>> ⟨https://www.joelonsoftware.com/2001/12/11/
>>>            back-to-basics/⟩.
>>>
>>>     EXAMPLES
>>>            #include <stdio.h>
>>>            #include <stdlib.h>
>>>            #include <string.h>
>>>
>>>            int
>>>            main(void)
>>>            {
>>>                char    buf[BUFSIZ];
>>>                size_t  len;
>>>
>>>                buf[0] = '\0';  // There’s no ’cpy’ function to this 
>>> ’cat’.
>>>                strncat(buf, "Hello ", 6);

There's nothing wrong with this but the two lines above would be
more simply coded as one:

   strcpy(buf, "Hello ");

The original code suggests a misunderstanding of strncpy's purpose:
that it writes exactly 6 bytes into the destination.  That's what
the warning points out.

>>>                strncat(buf, "world", 42);  // Padding null bytes 
>>> ignored.
>>>                strncat(buf, "!", 1);
>>>                len = strlen(buf);
>>>                printf("[len = %zu]: <%s>\n", len, buf);
>>>
>>>                exit(EXIT_SUCCESS);
>>>            }
>>>
>>>     SEE ALSO
>>>            string(3), string_copy(3)
>>>
>>>     Linux man‐pages (unreleased)      (date)                       
>>> strncat(3)
>>>
>>>
>>> And when you compile that, you get:
>>>
>>> $ cc -Wall -Wextra ./strncat.c
>>> ./strncat.c: In function ‘main’:
>>> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source 
>>> length
>>> [-Wstringop-overflow=]
>>>      12 |            strncat(buf, "Hello ", 6);
>>>         |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>>> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source 
>>> length
>>> [-Wstringop-overflow=]
>>>      14 |            strncat(buf, "!", 1);
>>>         |            ^~~~~~~~~~~~~~~~~~~~
>>>
>>>
>>> So, what?  Where's the problem?  This function does exactly that: 
>>> "take an
>>> unterminated character sequence and catenate it to an existing 
>>> string".

Strncat has historically had two distinct use cases.  One of them
-- to constrain the amount of data to copy to the space remaining
in the destination -- gained popularity with the push to reduce
buffer overflow weaknesses in code.  Mistakes in these uses gave
rise to a whole other class of security bugs, to the extent that
CERT felt it necessary to publish the strncpy and strncat best
practice.  The GCC warning in turn was added to support the CERT
guideline.  I touch on some of this in a blog post I wrote a few
years ago:

https://developers.redhat.com/blog/2018/05/24/detecting-string-truncation-with-gcc-8

The specific uses of the function above are contrived (there's
no point in calling strncat to append the full string -- strcat
will do that more clearly and efficiently) but the general use
case -- limiting the amount of copied data to an initial
substring of the source sequence -- although valid and originally
intended (it's one of the two uses of the function in UNIX v7),
is not one that either the guideline or the warning consider.
They can only consider one use cases, and they chose the one
that was observed behind security bugs.  That choice unavoidably
leads to some false positives.  The expected way to deal with
them is to suppress the warning by one of the usual mechanisms
(command line option or #pragma GCC diagnostic).

Martin

>>>  Clang
>>> seems to be fine with the code.
>>
>> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83404 and the
>> background of why the warning was added here:
>>
>> https://www.us-cert.gov/bsi/articles/knowledge/coding-practices/strncpy-and-strncat. 
>>
> 
> This document is bogus, since it's puting strncpy(3) and strncat(3) in 
> the same sack, when they're in reality two completely different beasts.  
> I'll quote below some paragraphs of some new page I'm writing, which 
> will show why.
> 
> The rationale behind GCC's warning is also fundamentally wrong.  Martin 
> was wrong when he claimed that the right call for strncat(3) is the 
> remaining space in the destination.
> 
> I admit that I didn't know what strncat(3) was useful for, and believed 
> that it was simply a broken-by-design function until very recently (this 
> week, more or less).  And to be honest, I still believe it's broken by 
> design; it's just that it can be repurposed for a reasonable new purpose 
> (which I found while digging in groff's source code; that's why the CC).
> 
> 
> First I'll show an example program that I added to the strncat(3) manual 
> page last week, which is based on the groff code that used it:
> 
> 
>         #include <stdio.h>
>         #include <stdlib.h>
>         #include <string.h>
> 
>         #define nitems(arr)  (sizeof((arr)) / sizeof((arr)[0]))
> 
>         int
>         main(void)
>         {
>             char pre[4] = "pre.";
>             char *post = ".post";
>             char *src = "some_long_body.post";
>             char dest[100];
> 
>             dest[0] = '\0';
>             strncat(dest, pre, nitems(pre));
>             strncat(dest, src, strlen(src) - strlen(post));
> 
>             puts(dest);  // "pre.some_long_body"
>             exit(EXIT_SUCCESS);
>         }
> 
> 
> And now I'll quote some text that I'm writing currently for the function:
> 
> 
>     Null‐padded character sequences
>         For historic reasons, some standard APIs, such as utmpx(5),  
> use  null‐
>         padded  character  sequences in fixed‐width buffers.  To 
> interface with
>         them, specialized functions need to be used.
> 
>         To copy strings into them, use stpncpy(3).
> 
>         To copy from an unterminated string within a fixed‐width buffer 
> into  a
>         string,  ignoring  any  trailing  null  bytes in the source 
> fixed‐width
>         buffer, you should use strncat(3).
> 
>         [...]
> 
>         stpncpy(3)
>                This function copies the input string  into  a  destination
>                null‐padded character sequence in a fixed‐width buffer.  If
>                the  destination  buffer,  limited by its size, isn’t large
>                enough to hold the copy, the resulting  character  sequence
>                is  truncated.   Since  it creates a character sequence, it
>                doesn’t need to write a terminating null byte.  It’s impos‐
>                sible to distinguish truncation  after  the  call,  from  a
>                character  sequence  that just fits the destination buffer;
>                truncation should be detected from the length of the origi‐
>                nal string.
> 
>         strncpy(3)
>                This function is identical to  stpncpy(3)  except  for  the
>                useless return value.
> 
>                stpncpy(3) is a simpler alternative to this function.
> 
>         [...]
> 
>         strncat(3)
>                Do not confuse this function with strncpy(3); they are  not
>                related at all.
> 
>                This  function  catenates the input character sequence con‐
>                tained in a null‐padded wixed‐width buffer, into a destina‐
>                tion string.  The programmer is responsible for  allocating
>                a buffer large enough.  The return value is useless.
> 
>                zustr2stp(3) is a faster alternative to this function.
> 
>                An implementation of this function might be:
> 
>                    char *
>                    strncat(char *restrict dst, const char *restrict src,
>                            size_t sz)
>                    {
>                        int   len;
>                        char  *end;
> 
>                        len = strnlen(src, sz);
>                        end = dst + strlen(dst);
>                        end = mempcpy(end, src, len);
>                        *end = '\0';
> 
>                        return dst;
>                    }
> 
> 
> Cheers,
> 
> Alex
> 
> 
> 
>>
>> Thanks,
>> Andrew Pinski
>>
>>>
>>> Cheers,
>>>
>>> Alex
>>>
>>>
>>> -- 
>>> <http://www.alejandro-colomar.es/>
> 


  reply	other threads:[~2022-12-15 20:50 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-14 22:45 Alejandro Colomar
2022-12-14 22:51 ` Alejandro Colomar
2022-12-14 22:51   ` Alejandro Colomar
2022-12-14 22:57 ` Andrew Pinski
2022-12-14 23:14   ` Alejandro Colomar
2022-12-15 20:50     ` Martin Sebor [this message]
2022-12-15 22:03       ` Alejandro Colomar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=746bce22-9b6f-9452-5f0e-0e257738a03d@gmail.com \
    --to=msebor@gmail.com \
    --cc=alx.manpages@gmail.com \
    --cc=gcc@gcc.gnu.org \
    --cc=groff@gnu.org \
    --cc=libc-alpha@sourceware.org \
    --cc=linux-man@vger.kernel.org \
    --cc=msebor@redhat.com \
    --cc=pinskia@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).