public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* [-Wstringop-overflow=] strncat(3)
@ 2022-12-14 22:45 Alejandro Colomar
  2022-12-14 22:51 ` Alejandro Colomar
  2022-12-14 22:57 ` Andrew Pinski
  0 siblings, 2 replies; 7+ messages in thread
From: Alejandro Colomar @ 2022-12-14 22:45 UTC (permalink / raw)
  To: gcc; +Cc: linux-man, GNU C Library


[-- Attachment #1.1: Type: text/plain, Size: 3516 bytes --]

Hi,

I was rewriting the strncat(3) manual page, and when I tried to compile the 
example program, I got a surprise from the compiler.

Here goes the page:


   strncat(3)               Library Functions Manual              strncat(3)

   NAME
          strncat  -  concatenate  a  null‐padded  character sequence into a
          string

   LIBRARY
          Standard C library (libc, -lc)

   SYNOPSIS
          #include <string.h>

          char *strncat(char *restrict dst, const char src[restrict .sz],
                         size_t sz);

   DESCRIPTION
          This function catenates the input character sequence contained  in
          a  null‐padded  fixed‐width  buffer,  into  a string at the buffer
          pointed to by dst.  The programmer is responsible for allocating a
          buffer large enough, that is, strlen(dst) + strnlen(src, sz) + 1.

          An implementation of this function might be:

              char *
              strncat(char *restrict dst, const char *restrict src, size_t sz)
              {
                  int   len;
                  char  *end;

                  len = strnlen(src, sz);
                  end = dst + strlen(dst);
                  end = mempcpy(end, src, len);
                  *end = '\0';

                  return dst;
              }

   RETURN VALUE
          strncat() returns dest.

   ATTRIBUTES
          [...]

   STANDARDS
          POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.

   CAVEATS
          The  name of this function is confusing.  This function has no re‐
          lation with strncpy(3).

          If the destination buffer is not large enough, the behavior is un‐
          defined.  See _FORTIFY_SOURCE in feature_test_macros(7).

   BUGS
          This function  can  be  very  inefficient.   Read  about  Shlemiel
          the       painter      ⟨https://www.joelonsoftware.com/2001/12/11/
          back-to-basics/⟩.

   EXAMPLES
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>

          int
          main(void)
          {
              char    buf[BUFSIZ];
              size_t  len;

              buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
              strncat(buf, "Hello ", 6);
              strncat(buf, "world", 42);  // Padding null bytes ignored.
              strncat(buf, "!", 1);
              len = strlen(buf);
              printf("[len = %zu]: <%s>\n", len, buf);

              exit(EXIT_SUCCESS);
          }

   SEE ALSO
          string(3), string_copy(3)

   Linux man‐pages (unreleased)      (date)                       strncat(3)


And when you compile that, you get:

$ cc -Wall -Wextra ./strncat.c
./strncat.c: In function ‘main’:
./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source length 
[-Wstringop-overflow=]
    12 |            strncat(buf, "Hello ", 6);
       |            ^~~~~~~~~~~~~~~~~~~~~~~~~
./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source length 
[-Wstringop-overflow=]
    14 |            strncat(buf, "!", 1);
       |            ^~~~~~~~~~~~~~~~~~~~


So, what?  Where's the problem?  This function does exactly that: "take an 
unterminated character sequence and catenate it to an existing string".  Clang 
seems to be fine with the code.

Cheers,

Alex


-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [-Wstringop-overflow=] strncat(3)
  2022-12-14 22:45 [-Wstringop-overflow=] strncat(3) Alejandro Colomar
@ 2022-12-14 22:51 ` Alejandro Colomar
  2022-12-14 22:51   ` Alejandro Colomar
  2022-12-14 22:57 ` Andrew Pinski
  1 sibling, 1 reply; 7+ messages in thread
From: Alejandro Colomar @ 2022-12-14 22:51 UTC (permalink / raw)
  To: gcc; +Cc: linux-man, GNU C Library


[-- Attachment #1.1: Type: text/plain, Size: 4583 bytes --]



On 12/14/22 23:45, Alejandro Colomar wrote:
> Hi,
> 
> I was rewriting the strncat(3) manual page, and when I tried to compile the 
> example program, I got a surprise from the compiler.
> 
> Here goes the page:
> 
> 
>    strncat(3)               Library Functions Manual              strncat(3)
> 
>    NAME
>           strncat  -  concatenate  a  null‐padded  character sequence into a
>           string
> 
>    LIBRARY
>           Standard C library (libc, -lc)
> 
>    SYNOPSIS
>           #include <string.h>
> 
>           char *strncat(char *restrict dst, const char src[restrict .sz],
>                          size_t sz);
> 
>    DESCRIPTION
>           This function catenates the input character sequence contained  in
>           a  null‐padded  fixed‐width  buffer,  into  a string at the buffer
>           pointed to by dst.  The programmer is responsible for allocating a
>           buffer large enough, that is, strlen(dst) + strnlen(src, sz) + 1.
> 
>           An implementation of this function might be:
> 
>               char *
>               strncat(char *restrict dst, const char *restrict src, size_t sz)
>               {
>                   int   len;
>                   char  *end;
> 
>                   len = strnlen(src, sz);
>                   end = dst + strlen(dst);
>                   end = mempcpy(end, src, len);
>                   *end = '\0';
> 
>                   return dst;
>               }
> 
>    RETURN VALUE
>           strncat() returns dest.
> 
>    ATTRIBUTES
>           [...]
> 
>    STANDARDS
>           POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
> 
>    CAVEATS
>           The  name of this function is confusing.  This function has no re‐
>           lation with strncpy(3).
> 
>           If the destination buffer is not large enough, the behavior is un‐
>           defined.  See _FORTIFY_SOURCE in feature_test_macros(7).
> 
>    BUGS
>           This function  can  be  very  inefficient.   Read  about  Shlemiel
>           the       painter      ⟨https://www.joelonsoftware.com/2001/12/11/
>           back-to-basics/⟩.
> 
>    EXAMPLES
>           #include <stdio.h>
>           #include <stdlib.h>
>           #include <string.h>
> 
>           int
>           main(void)
>           {
>               char    buf[BUFSIZ];
>               size_t  len;
> 
>               buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
>               strncat(buf, "Hello ", 6);
>               strncat(buf, "world", 42);  // Padding null bytes ignored.
>               strncat(buf, "!", 1);
>               len = strlen(buf);
>               printf("[len = %zu]: <%s>\n", len, buf);
> 
>               exit(EXIT_SUCCESS);
>           }
> 
>    SEE ALSO
>           string(3), string_copy(3)
> 
>    Linux man‐pages (unreleased)      (date)                       strncat(3)
> 
> 
> And when you compile that, you get:
> 
> $ cc -Wall -Wextra ./strncat.c
> ./strncat.c: In function ‘main’:
> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source length 
> [-Wstringop-overflow=]
>     12 |            strncat(buf, "Hello ", 6);
>        |            ^~~~~~~~~~~~~~~~~~~~~~~~~
> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source length 
> [-Wstringop-overflow=]
>     14 |            strncat(buf, "!", 1);
>        |            ^~~~~~~~~~~~~~~~~~~~
> 
> 
> So, what?  Where's the problem?  This function does exactly that: "take an 
> unterminated character sequence and catenate it to an existing string".  Clang 
> seems to be fine with the code.

Maybe it's saying that I should be using strncat(buf, "!"); because the length 
is useless?

> 
> Cheers,
> 
> Alex
> 
> 

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [-Wstringop-overflow=] strncat(3)
  2022-12-14 22:51 ` Alejandro Colomar
@ 2022-12-14 22:51   ` Alejandro Colomar
  0 siblings, 0 replies; 7+ messages in thread
From: Alejandro Colomar @ 2022-12-14 22:51 UTC (permalink / raw)
  To: gcc; +Cc: linux-man, GNU C Library


[-- Attachment #1.1: Type: text/plain, Size: 4839 bytes --]



On 12/14/22 23:51, Alejandro Colomar wrote:
> 
> 
> On 12/14/22 23:45, Alejandro Colomar wrote:
>> Hi,
>>
>> I was rewriting the strncat(3) manual page, and when I tried to compile the 
>> example program, I got a surprise from the compiler.
>>
>> Here goes the page:
>>
>>
>>    strncat(3)               Library Functions Manual              strncat(3)
>>
>>    NAME
>>           strncat  -  concatenate  a  null‐padded  character sequence into a
>>           string
>>
>>    LIBRARY
>>           Standard C library (libc, -lc)
>>
>>    SYNOPSIS
>>           #include <string.h>
>>
>>           char *strncat(char *restrict dst, const char src[restrict .sz],
>>                          size_t sz);
>>
>>    DESCRIPTION
>>           This function catenates the input character sequence contained  in
>>           a  null‐padded  fixed‐width  buffer,  into  a string at the buffer
>>           pointed to by dst.  The programmer is responsible for allocating a
>>           buffer large enough, that is, strlen(dst) + strnlen(src, sz) + 1.
>>
>>           An implementation of this function might be:
>>
>>               char *
>>               strncat(char *restrict dst, const char *restrict src, size_t sz)
>>               {
>>                   int   len;
>>                   char  *end;
>>
>>                   len = strnlen(src, sz);
>>                   end = dst + strlen(dst);
>>                   end = mempcpy(end, src, len);
>>                   *end = '\0';
>>
>>                   return dst;
>>               }
>>
>>    RETURN VALUE
>>           strncat() returns dest.
>>
>>    ATTRIBUTES
>>           [...]
>>
>>    STANDARDS
>>           POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>>
>>    CAVEATS
>>           The  name of this function is confusing.  This function has no re‐
>>           lation with strncpy(3).
>>
>>           If the destination buffer is not large enough, the behavior is un‐
>>           defined.  See _FORTIFY_SOURCE in feature_test_macros(7).
>>
>>    BUGS
>>           This function  can  be  very  inefficient.   Read  about  Shlemiel
>>           the       painter      ⟨https://www.joelonsoftware.com/2001/12/11/
>>           back-to-basics/⟩.
>>
>>    EXAMPLES
>>           #include <stdio.h>
>>           #include <stdlib.h>
>>           #include <string.h>
>>
>>           int
>>           main(void)
>>           {
>>               char    buf[BUFSIZ];
>>               size_t  len;
>>
>>               buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
>>               strncat(buf, "Hello ", 6);
>>               strncat(buf, "world", 42);  // Padding null bytes ignored.
>>               strncat(buf, "!", 1);
>>               len = strlen(buf);
>>               printf("[len = %zu]: <%s>\n", len, buf);
>>
>>               exit(EXIT_SUCCESS);
>>           }
>>
>>    SEE ALSO
>>           string(3), string_copy(3)
>>
>>    Linux man‐pages (unreleased)      (date)                       strncat(3)
>>
>>
>> And when you compile that, you get:
>>
>> $ cc -Wall -Wextra ./strncat.c
>> ./strncat.c: In function ‘main’:
>> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source length 
>> [-Wstringop-overflow=]
>>     12 |            strncat(buf, "Hello ", 6);
>>        |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source length 
>> [-Wstringop-overflow=]
>>     14 |            strncat(buf, "!", 1);
>>        |            ^~~~~~~~~~~~~~~~~~~~
>>
>>
>> So, what?  Where's the problem?  This function does exactly that: "take an 
>> unterminated character sequence and catenate it to an existing string".  Clang 
>> seems to be fine with the code.
> 
> Maybe it's saying that I should be using strncat(buf, "!"); because the length 

oops;  of course, I meant strcat().

> is useless?
> 
>>
>> Cheers,
>>
>> Alex
>>
>>
> 

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [-Wstringop-overflow=] strncat(3)
  2022-12-14 22:45 [-Wstringop-overflow=] strncat(3) Alejandro Colomar
  2022-12-14 22:51 ` Alejandro Colomar
@ 2022-12-14 22:57 ` Andrew Pinski
  2022-12-14 23:14   ` Alejandro Colomar
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Pinski @ 2022-12-14 22:57 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: gcc, linux-man, GNU C Library

On Wed, Dec 14, 2022 at 2:46 PM Alejandro Colomar via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi,
>
> I was rewriting the strncat(3) manual page, and when I tried to compile the
> example program, I got a surprise from the compiler.
>
> Here goes the page:
>
>
>    strncat(3)               Library Functions Manual              strncat(3)
>
>    NAME
>           strncat  -  concatenate  a  null‐padded  character sequence into a
>           string
>
>    LIBRARY
>           Standard C library (libc, -lc)
>
>    SYNOPSIS
>           #include <string.h>
>
>           char *strncat(char *restrict dst, const char src[restrict .sz],
>                          size_t sz);
>
>    DESCRIPTION
>           This function catenates the input character sequence contained  in
>           a  null‐padded  fixed‐width  buffer,  into  a string at the buffer
>           pointed to by dst.  The programmer is responsible for allocating a
>           buffer large enough, that is, strlen(dst) + strnlen(src, sz) + 1.
>
>           An implementation of this function might be:
>
>               char *
>               strncat(char *restrict dst, const char *restrict src, size_t sz)
>               {
>                   int   len;
>                   char  *end;
>
>                   len = strnlen(src, sz);
>                   end = dst + strlen(dst);
>                   end = mempcpy(end, src, len);
>                   *end = '\0';
>
>                   return dst;
>               }
>
>    RETURN VALUE
>           strncat() returns dest.
>
>    ATTRIBUTES
>           [...]
>
>    STANDARDS
>           POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>
>    CAVEATS
>           The  name of this function is confusing.  This function has no re‐
>           lation with strncpy(3).
>
>           If the destination buffer is not large enough, the behavior is un‐
>           defined.  See _FORTIFY_SOURCE in feature_test_macros(7).
>
>    BUGS
>           This function  can  be  very  inefficient.   Read  about  Shlemiel
>           the       painter      ⟨https://www.joelonsoftware.com/2001/12/11/
>           back-to-basics/⟩.
>
>    EXAMPLES
>           #include <stdio.h>
>           #include <stdlib.h>
>           #include <string.h>
>
>           int
>           main(void)
>           {
>               char    buf[BUFSIZ];
>               size_t  len;
>
>               buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
>               strncat(buf, "Hello ", 6);
>               strncat(buf, "world", 42);  // Padding null bytes ignored.
>               strncat(buf, "!", 1);
>               len = strlen(buf);
>               printf("[len = %zu]: <%s>\n", len, buf);
>
>               exit(EXIT_SUCCESS);
>           }
>
>    SEE ALSO
>           string(3), string_copy(3)
>
>    Linux man‐pages (unreleased)      (date)                       strncat(3)
>
>
> And when you compile that, you get:
>
> $ cc -Wall -Wextra ./strncat.c
> ./strncat.c: In function ‘main’:
> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source length
> [-Wstringop-overflow=]
>     12 |            strncat(buf, "Hello ", 6);
>        |            ^~~~~~~~~~~~~~~~~~~~~~~~~
> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source length
> [-Wstringop-overflow=]
>     14 |            strncat(buf, "!", 1);
>        |            ^~~~~~~~~~~~~~~~~~~~
>
>
> So, what?  Where's the problem?  This function does exactly that: "take an
> unterminated character sequence and catenate it to an existing string".  Clang
> seems to be fine with the code.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83404 and the
background of why the warning was added here:

https://www.us-cert.gov/bsi/articles/knowledge/coding-practices/strncpy-and-strncat.

Thanks,
Andrew Pinski

>
> Cheers,
>
> Alex
>
>
> --
> <http://www.alejandro-colomar.es/>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [-Wstringop-overflow=] strncat(3)
  2022-12-14 22:57 ` Andrew Pinski
@ 2022-12-14 23:14   ` Alejandro Colomar
  2022-12-15 20:50     ` Martin Sebor
  0 siblings, 1 reply; 7+ messages in thread
From: Alejandro Colomar @ 2022-12-14 23:14 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc, linux-man, GNU C Library, groff, Martin Sebor


[-- Attachment #1.1: Type: text/plain, Size: 8569 bytes --]

[CC += groff]

Hi Andrew,

On 12/14/22 23:57, Andrew Pinski wrote:
> On Wed, Dec 14, 2022 at 2:46 PM Alejandro Colomar via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
>>
>> Hi,
>>
>> I was rewriting the strncat(3) manual page, and when I tried to compile the
>> example program, I got a surprise from the compiler.
>>
>> Here goes the page:
>>
>>
>>     strncat(3)               Library Functions Manual              strncat(3)
>>
>>     NAME
>>            strncat  -  concatenate  a  null‐padded  character sequence into a
>>            string
>>
>>     LIBRARY
>>            Standard C library (libc, -lc)
>>
>>     SYNOPSIS
>>            #include <string.h>
>>
>>            char *strncat(char *restrict dst, const char src[restrict .sz],
>>                           size_t sz);
>>
>>     DESCRIPTION
>>            This function catenates the input character sequence contained  in
>>            a  null‐padded  fixed‐width  buffer,  into  a string at the buffer
>>            pointed to by dst.  The programmer is responsible for allocating a
>>            buffer large enough, that is, strlen(dst) + strnlen(src, sz) + 1.
>>
>>            An implementation of this function might be:
>>
>>                char *
>>                strncat(char *restrict dst, const char *restrict src, size_t sz)
>>                {
>>                    int   len;
>>                    char  *end;
>>
>>                    len = strnlen(src, sz);
>>                    end = dst + strlen(dst);
>>                    end = mempcpy(end, src, len);
>>                    *end = '\0';
>>
>>                    return dst;
>>                }
>>
>>     RETURN VALUE
>>            strncat() returns dest.
>>
>>     ATTRIBUTES
>>            [...]
>>
>>     STANDARDS
>>            POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>>
>>     CAVEATS
>>            The  name of this function is confusing.  This function has no re‐
>>            lation with strncpy(3).
>>
>>            If the destination buffer is not large enough, the behavior is un‐
>>            defined.  See _FORTIFY_SOURCE in feature_test_macros(7).
>>
>>     BUGS
>>            This function  can  be  very  inefficient.   Read  about  Shlemiel
>>            the       painter      ⟨https://www.joelonsoftware.com/2001/12/11/
>>            back-to-basics/⟩.
>>
>>     EXAMPLES
>>            #include <stdio.h>
>>            #include <stdlib.h>
>>            #include <string.h>
>>
>>            int
>>            main(void)
>>            {
>>                char    buf[BUFSIZ];
>>                size_t  len;
>>
>>                buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
>>                strncat(buf, "Hello ", 6);
>>                strncat(buf, "world", 42);  // Padding null bytes ignored.
>>                strncat(buf, "!", 1);
>>                len = strlen(buf);
>>                printf("[len = %zu]: <%s>\n", len, buf);
>>
>>                exit(EXIT_SUCCESS);
>>            }
>>
>>     SEE ALSO
>>            string(3), string_copy(3)
>>
>>     Linux man‐pages (unreleased)      (date)                       strncat(3)
>>
>>
>> And when you compile that, you get:
>>
>> $ cc -Wall -Wextra ./strncat.c
>> ./strncat.c: In function ‘main’:
>> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source length
>> [-Wstringop-overflow=]
>>      12 |            strncat(buf, "Hello ", 6);
>>         |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source length
>> [-Wstringop-overflow=]
>>      14 |            strncat(buf, "!", 1);
>>         |            ^~~~~~~~~~~~~~~~~~~~
>>
>>
>> So, what?  Where's the problem?  This function does exactly that: "take an
>> unterminated character sequence and catenate it to an existing string".  Clang
>> seems to be fine with the code.
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83404 and the
> background of why the warning was added here:
> 
> https://www.us-cert.gov/bsi/articles/knowledge/coding-practices/strncpy-and-strncat.

This document is bogus, since it's puting strncpy(3) and strncat(3) in the same 
sack, when they're in reality two completely different beasts.  I'll quote below 
some paragraphs of some new page I'm writing, which will show why.

The rationale behind GCC's warning is also fundamentally wrong.  Martin was 
wrong when he claimed that the right call for strncat(3) is the remaining space 
in the destination.

I admit that I didn't know what strncat(3) was useful for, and believed that it 
was simply a broken-by-design function until very recently (this week, more or 
less).  And to be honest, I still believe it's broken by design; it's just that 
it can be repurposed for a reasonable new purpose (which I found while digging 
in groff's source code; that's why the CC).


First I'll show an example program that I added to the strncat(3) manual page 
last week, which is based on the groff code that used it:


        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        #define nitems(arr)  (sizeof((arr)) / sizeof((arr)[0]))

        int
        main(void)
        {
            char pre[4] = "pre.";
            char *post = ".post";
            char *src = "some_long_body.post";
            char dest[100];

            dest[0] = '\0';
            strncat(dest, pre, nitems(pre));
            strncat(dest, src, strlen(src) - strlen(post));

            puts(dest);  // "pre.some_long_body"
            exit(EXIT_SUCCESS);
        }


And now I'll quote some text that I'm writing currently for the function:


    Null‐padded character sequences
        For historic reasons, some standard APIs, such as utmpx(5),  use  null‐
        padded  character  sequences in fixed‐width buffers.  To interface with
        them, specialized functions need to be used.

        To copy strings into them, use stpncpy(3).

        To copy from an unterminated string within a fixed‐width buffer into  a
        string,  ignoring  any  trailing  null  bytes in the source fixed‐width
        buffer, you should use strncat(3).

        [...]

        stpncpy(3)
               This function copies the input string  into  a  destination
               null‐padded character sequence in a fixed‐width buffer.  If
               the  destination  buffer,  limited by its size, isn’t large
               enough to hold the copy, the resulting  character  sequence
               is  truncated.   Since  it creates a character sequence, it
               doesn’t need to write a terminating null byte.  It’s impos‐
               sible to distinguish truncation  after  the  call,  from  a
               character  sequence  that just fits the destination buffer;
               truncation should be detected from the length of the origi‐
               nal string.

        strncpy(3)
               This function is identical to  stpncpy(3)  except  for  the
               useless return value.

               stpncpy(3) is a simpler alternative to this function.

        [...]

        strncat(3)
               Do not confuse this function with strncpy(3); they are  not
               related at all.

               This  function  catenates the input character sequence con‐
               tained in a null‐padded wixed‐width buffer, into a destina‐
               tion string.  The programmer is responsible for  allocating
               a buffer large enough.  The return value is useless.

               zustr2stp(3) is a faster alternative to this function.

               An implementation of this function might be:

                   char *
                   strncat(char *restrict dst, const char *restrict src,
                           size_t sz)
                   {
                       int   len;
                       char  *end;

                       len = strnlen(src, sz);
                       end = dst + strlen(dst);
                       end = mempcpy(end, src, len);
                       *end = '\0';

                       return dst;
                   }


Cheers,

Alex



> 
> Thanks,
> Andrew Pinski
> 
>>
>> Cheers,
>>
>> Alex
>>
>>
>> --
>> <http://www.alejandro-colomar.es/>

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [-Wstringop-overflow=] strncat(3)
  2022-12-14 23:14   ` Alejandro Colomar
@ 2022-12-15 20:50     ` Martin Sebor
  2022-12-15 22:03       ` Alejandro Colomar
  0 siblings, 1 reply; 7+ messages in thread
From: Martin Sebor @ 2022-12-15 20:50 UTC (permalink / raw)
  To: Alejandro Colomar, Andrew Pinski
  Cc: gcc, linux-man, GNU C Library, groff, Martin Sebor

On 12/14/22 16:14, Alejandro Colomar via Libc-alpha wrote:
> [CC += groff]
> 
> Hi Andrew,
> 
> On 12/14/22 23:57, Andrew Pinski wrote:
>> On Wed, Dec 14, 2022 at 2:46 PM Alejandro Colomar via Libc-alpha
>> <libc-alpha@sourceware.org> wrote:
>>>
>>> Hi,
>>>
>>> I was rewriting the strncat(3) manual page, and when I tried to 
>>> compile the
>>> example program, I got a surprise from the compiler.
>>>
>>> Here goes the page:
>>>
>>>
>>>     strncat(3)               Library Functions Manual              
>>> strncat(3)
>>>
>>>     NAME
>>>            strncat  -  concatenate  a  null‐padded  character 
>>> sequence into a
>>>            string
>>>
>>>     LIBRARY
>>>            Standard C library (libc, -lc)
>>>
>>>     SYNOPSIS
>>>            #include <string.h>
>>>
>>>            char *strncat(char *restrict dst, const char src[restrict 
>>> .sz],
>>>                           size_t sz);
>>>
>>>     DESCRIPTION
>>>            This function catenates the input character sequence 
>>> contained  in
>>>            a  null‐padded  fixed‐width  buffer,  into  a string at 
>>> the buffer
>>>            pointed to by dst.  The programmer is responsible for 
>>> allocating a
>>>            buffer large enough, that is, strlen(dst) + strnlen(src, 
>>> sz) + 1.
>>>
>>>            An implementation of this function might be:
>>>
>>>                char *
>>>                strncat(char *restrict dst, const char *restrict src, 
>>> size_t sz)
>>>                {
>>>                    int   len;
>>>                    char  *end;
>>>
>>>                    len = strnlen(src, sz);
>>>                    end = dst + strlen(dst);
>>>                    end = mempcpy(end, src, len);
>>>                    *end = '\0';
>>>
>>>                    return dst;
>>>                }
>>>
>>>     RETURN VALUE
>>>            strncat() returns dest.
>>>
>>>     ATTRIBUTES
>>>            [...]
>>>
>>>     STANDARDS
>>>            POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>>>
>>>     CAVEATS
>>>            The  name of this function is confusing.  This function 
>>> has no re‐
>>>            lation with strncpy(3).
>>>
>>>            If the destination buffer is not large enough, the 
>>> behavior is un‐
>>>            defined.  See _FORTIFY_SOURCE in feature_test_macros(7).
>>>
>>>     BUGS
>>>            This function  can  be  very  inefficient.   Read  about  
>>> Shlemiel
>>>            the       painter      
>>> ⟨https://www.joelonsoftware.com/2001/12/11/
>>>            back-to-basics/⟩.
>>>
>>>     EXAMPLES
>>>            #include <stdio.h>
>>>            #include <stdlib.h>
>>>            #include <string.h>
>>>
>>>            int
>>>            main(void)
>>>            {
>>>                char    buf[BUFSIZ];
>>>                size_t  len;
>>>
>>>                buf[0] = '\0';  // There’s no ’cpy’ function to this 
>>> ’cat’.
>>>                strncat(buf, "Hello ", 6);

There's nothing wrong with this but the two lines above would be
more simply coded as one:

   strcpy(buf, "Hello ");

The original code suggests a misunderstanding of strncpy's purpose:
that it writes exactly 6 bytes into the destination.  That's what
the warning points out.

>>>                strncat(buf, "world", 42);  // Padding null bytes 
>>> ignored.
>>>                strncat(buf, "!", 1);
>>>                len = strlen(buf);
>>>                printf("[len = %zu]: <%s>\n", len, buf);
>>>
>>>                exit(EXIT_SUCCESS);
>>>            }
>>>
>>>     SEE ALSO
>>>            string(3), string_copy(3)
>>>
>>>     Linux man‐pages (unreleased)      (date)                       
>>> strncat(3)
>>>
>>>
>>> And when you compile that, you get:
>>>
>>> $ cc -Wall -Wextra ./strncat.c
>>> ./strncat.c: In function ‘main’:
>>> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source 
>>> length
>>> [-Wstringop-overflow=]
>>>      12 |            strncat(buf, "Hello ", 6);
>>>         |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>>> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source 
>>> length
>>> [-Wstringop-overflow=]
>>>      14 |            strncat(buf, "!", 1);
>>>         |            ^~~~~~~~~~~~~~~~~~~~
>>>
>>>
>>> So, what?  Where's the problem?  This function does exactly that: 
>>> "take an
>>> unterminated character sequence and catenate it to an existing 
>>> string".

Strncat has historically had two distinct use cases.  One of them
-- to constrain the amount of data to copy to the space remaining
in the destination -- gained popularity with the push to reduce
buffer overflow weaknesses in code.  Mistakes in these uses gave
rise to a whole other class of security bugs, to the extent that
CERT felt it necessary to publish the strncpy and strncat best
practice.  The GCC warning in turn was added to support the CERT
guideline.  I touch on some of this in a blog post I wrote a few
years ago:

https://developers.redhat.com/blog/2018/05/24/detecting-string-truncation-with-gcc-8

The specific uses of the function above are contrived (there's
no point in calling strncat to append the full string -- strcat
will do that more clearly and efficiently) but the general use
case -- limiting the amount of copied data to an initial
substring of the source sequence -- although valid and originally
intended (it's one of the two uses of the function in UNIX v7),
is not one that either the guideline or the warning consider.
They can only consider one use cases, and they chose the one
that was observed behind security bugs.  That choice unavoidably
leads to some false positives.  The expected way to deal with
them is to suppress the warning by one of the usual mechanisms
(command line option or #pragma GCC diagnostic).

Martin

>>>  Clang
>>> seems to be fine with the code.
>>
>> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83404 and the
>> background of why the warning was added here:
>>
>> https://www.us-cert.gov/bsi/articles/knowledge/coding-practices/strncpy-and-strncat. 
>>
> 
> This document is bogus, since it's puting strncpy(3) and strncat(3) in 
> the same sack, when they're in reality two completely different beasts.  
> I'll quote below some paragraphs of some new page I'm writing, which 
> will show why.
> 
> The rationale behind GCC's warning is also fundamentally wrong.  Martin 
> was wrong when he claimed that the right call for strncat(3) is the 
> remaining space in the destination.
> 
> I admit that I didn't know what strncat(3) was useful for, and believed 
> that it was simply a broken-by-design function until very recently (this 
> week, more or less).  And to be honest, I still believe it's broken by 
> design; it's just that it can be repurposed for a reasonable new purpose 
> (which I found while digging in groff's source code; that's why the CC).
> 
> 
> First I'll show an example program that I added to the strncat(3) manual 
> page last week, which is based on the groff code that used it:
> 
> 
>         #include <stdio.h>
>         #include <stdlib.h>
>         #include <string.h>
> 
>         #define nitems(arr)  (sizeof((arr)) / sizeof((arr)[0]))
> 
>         int
>         main(void)
>         {
>             char pre[4] = "pre.";
>             char *post = ".post";
>             char *src = "some_long_body.post";
>             char dest[100];
> 
>             dest[0] = '\0';
>             strncat(dest, pre, nitems(pre));
>             strncat(dest, src, strlen(src) - strlen(post));
> 
>             puts(dest);  // "pre.some_long_body"
>             exit(EXIT_SUCCESS);
>         }
> 
> 
> And now I'll quote some text that I'm writing currently for the function:
> 
> 
>     Null‐padded character sequences
>         For historic reasons, some standard APIs, such as utmpx(5),  
> use  null‐
>         padded  character  sequences in fixed‐width buffers.  To 
> interface with
>         them, specialized functions need to be used.
> 
>         To copy strings into them, use stpncpy(3).
> 
>         To copy from an unterminated string within a fixed‐width buffer 
> into  a
>         string,  ignoring  any  trailing  null  bytes in the source 
> fixed‐width
>         buffer, you should use strncat(3).
> 
>         [...]
> 
>         stpncpy(3)
>                This function copies the input string  into  a  destination
>                null‐padded character sequence in a fixed‐width buffer.  If
>                the  destination  buffer,  limited by its size, isn’t large
>                enough to hold the copy, the resulting  character  sequence
>                is  truncated.   Since  it creates a character sequence, it
>                doesn’t need to write a terminating null byte.  It’s impos‐
>                sible to distinguish truncation  after  the  call,  from  a
>                character  sequence  that just fits the destination buffer;
>                truncation should be detected from the length of the origi‐
>                nal string.
> 
>         strncpy(3)
>                This function is identical to  stpncpy(3)  except  for  the
>                useless return value.
> 
>                stpncpy(3) is a simpler alternative to this function.
> 
>         [...]
> 
>         strncat(3)
>                Do not confuse this function with strncpy(3); they are  not
>                related at all.
> 
>                This  function  catenates the input character sequence con‐
>                tained in a null‐padded wixed‐width buffer, into a destina‐
>                tion string.  The programmer is responsible for  allocating
>                a buffer large enough.  The return value is useless.
> 
>                zustr2stp(3) is a faster alternative to this function.
> 
>                An implementation of this function might be:
> 
>                    char *
>                    strncat(char *restrict dst, const char *restrict src,
>                            size_t sz)
>                    {
>                        int   len;
>                        char  *end;
> 
>                        len = strnlen(src, sz);
>                        end = dst + strlen(dst);
>                        end = mempcpy(end, src, len);
>                        *end = '\0';
> 
>                        return dst;
>                    }
> 
> 
> Cheers,
> 
> Alex
> 
> 
> 
>>
>> Thanks,
>> Andrew Pinski
>>
>>>
>>> Cheers,
>>>
>>> Alex
>>>
>>>
>>> -- 
>>> <http://www.alejandro-colomar.es/>
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [-Wstringop-overflow=] strncat(3)
  2022-12-15 20:50     ` Martin Sebor
@ 2022-12-15 22:03       ` Alejandro Colomar
  0 siblings, 0 replies; 7+ messages in thread
From: Alejandro Colomar @ 2022-12-15 22:03 UTC (permalink / raw)
  To: Martin Sebor, Andrew Pinski
  Cc: gcc, linux-man, GNU C Library, groff, Martin Sebor


[-- Attachment #1.1: Type: text/plain, Size: 7933 bytes --]

Hi Martin,

On 12/15/22 21:50, Martin Sebor wrote:
> On 12/14/22 16:14, Alejandro Colomar via Libc-alpha wrote:
[...]

>>>>
>>>>            int
>>>>            main(void)
>>>>            {
>>>>                char    buf[BUFSIZ];
>>>>                size_t  len;
>>>>
>>>>                buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
>>>>                strncat(buf, "Hello ", 6);
> 
> There's nothing wrong with this but the two lines above would be
> more simply coded as one:
> 
>    strcpy(buf, "Hello ");

I used a string literal to avoid having to think hard of an example using 
utmpx(5), which would be the actual code for which I want users to use this 
function.

With time, I expect to develop such example program.

> 
> The original code suggests a misunderstanding of strncpy's purpose:

By the "original code" what are you exactly referring to?  The code quoted 
above, or another?

In any case, the code is using strncat(3) not strncpy(3).

> that it writes exactly 6 bytes into the destination.

That's the purpose of strncpy(3).  I'll quote the versions of the pages that I'm 
proposing:


strncpy(3):
        These functions copy the string pointed to by src  into  a  null‐padded
        character sequence at the fixed‐width buffer pointed to by dst.  If the
        destination buffer, limited by its size, isn’t large enough to hold the
        copy,  the resulting character sequence is truncated.  They only differ
        in the return value.


The description above says that the destination buffer is limited by its size, 
so as you say, this function copies exactly 6 bytes (the character sequence + 
null padding).


strncat(3)
        This  function  catenates  the  input character sequence contained in a
        null‐padded fixed‐width buffer, into a string at the buffer pointed  to
        by  dst.   The  programmer is responsible for allocating a buffer large
        enough, that is, strlen(dst) + strnlen(src, sz) + 1.


But strncat(3) writes whatever is the length of the character sequence passed as 
input (ignoring the padding null bytes), plus a terminating null byte.


>  That's what
> the warning points out.

The warning would be fair for strncpy(3), but not for strncat(3).

> 
>>>>                strncat(buf, "world", 42);  // Padding null bytes ignored.
>>>>                strncat(buf, "!", 1);
>>>>                len = strlen(buf);
>>>>                printf("[len = %zu]: <%s>\n", len, buf);
>>>>
>>>>                exit(EXIT_SUCCESS);
>>>>            }
>>>>
>>>>     SEE ALSO
>>>>            string(3), string_copy(3)
>>>>
>>>>     Linux man‐pages (unreleased)      (date) strncat(3)
>>>>
>>>>
>>>> And when you compile that, you get:
>>>>
>>>> $ cc -Wall -Wextra ./strncat.c
>>>> ./strncat.c: In function ‘main’:
>>>> ./strncat.c:12:12: warning: ‘strncat’ specified bound 6 equals source length
>>>> [-Wstringop-overflow=]
>>>>      12 |            strncat(buf, "Hello ", 6);
>>>>         |            ^~~~~~~~~~~~~~~~~~~~~~~~~
>>>> ./strncat.c:14:12: warning: ‘strncat’ specified bound 1 equals source length
>>>> [-Wstringop-overflow=]
>>>>      14 |            strncat(buf, "!", 1);
>>>>         |            ^~~~~~~~~~~~~~~~~~~~
>>>>
>>>>
>>>> So, what?  Where's the problem?  This function does exactly that: "take an
>>>> unterminated character sequence and catenate it to an existing string".
> 
> Strncat has historically had two distinct use cases.  One of them
> -- to constrain the amount of data to copy to the space remaining
> in the destination -- gained popularity with the push to reduce
> buffer overflow weaknesses in code.  Mistakes in these uses gave
> rise to a whole other class of security bugs, to the extent that
> CERT felt it necessary to publish the strncpy and strncat best
> practice.  The GCC warning in turn was added to support the CERT
> guideline.  I touch on some of this in a blog post I wrote a few
> years ago:
> 
> https://developers.redhat.com/blog/2018/05/24/detecting-string-truncation-with-gcc-8

While this is true, that's because there was nothing more appropriate, such as 
strlcat(3).

My point is that strncat(3) is far from being a good tool for that, and should 
never be used for that purpose in good code; especially now that we have 
strlcat(3) widely available in most if not all POSIX systems (in Linux through 
libbsd).

I'll quote some part of the link above:

|    strncat (dest, src, dest_size - strlen (dest) - 1);
|
| Calls that have this form are not diagnosed.

That code is a source of bugs.  Such code should _always_ be replaced by 
strlcat(3) as:

     strlcat(dst, src, sizeof(dst);


| Other calls, such as those where
| the size is derived in some way from the size or length of the source string,
| are diagnosed by -Wstringop-overflow. That includes unsafe calls like

The only sane use of strncat(3) is that.  It's a use where no other function 
serves better.  To read a possibly non-terminated character sequence from a 
fixed-width buffer; and the size is the only way strncat(3) can know when to 
stop reading.

|
|    strncat (dest, src, strlen (src));   // incorrect - warning

This call doesn't make any sense; I agree.  It should _always_ be replaced by 
strcat(3) as:

     strcat(dst, src);

I acknowledge that this is the case I show in the manual page, but only because 
there is no form to create an unterminated string literal (I take this 
opportunity to re-suggest "foo"u as an unterminated string literal).

I'll try to come up with an example program that avoids this.

|
| and
|
|    strncat (dest, src, sizeof src);   // incorrect - warning

This is the _perfect_ use case for strncat(3).  No other function is better.

Let's say you have the following:

char src[5] = {'h', 'e', 'l', 'l', 'o'};

How do you copy that into a string?  strncat(3).  utmpx(5) is one of the 
examples where this happens.

I'll modify the example program to use this, to avoid being misleading.  I'll 
replace string literals by character sequences.


> 
> The specific uses of the function above are contrived (there's
> no point in calling strncat to append the full string -- strcat
> will do that more clearly and efficiently) but the general use
> case -- limiting the amount of copied data to an initial
> substring of the source sequence -- although valid and originally
> intended (it's one of the two uses of the function in UNIX v7),
> is not one that either the guideline or the warning consider.
> They can only consider one use cases, and they chose the one
> that was observed behind security bugs. 

It's sad that strncat(3) has been used so much for limiting the dest copy.  It's 
pretty unsafe for that, we agree.  How about allowing the safe use of it (i.e., 
size derived from source buffer), and warn about uses where the size is derived 
from the dst string?  I'm going to rewrite the manual page in that direction.

> That choice unavoidably
> leads to some false positives.  The expected way to deal with
> them is to suppress the warning by one of the usual mechanisms
> (command line option or #pragma GCC diagnostic).

The only use case that GCC is allowing is better served by strlcpy(3) and 
strlcat(3).  I suggest changing the warning to protect the only natural use case 
for strncat(3), and warn about uses of it supplanting strlcat(3) (and maybe 
suggest using strlcat(3)).

> 
> Martin

Thanks!

Alex

-- 
<http://www.alejandro-colomar.es/>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-12-15 22:03 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-14 22:45 [-Wstringop-overflow=] strncat(3) Alejandro Colomar
2022-12-14 22:51 ` Alejandro Colomar
2022-12-14 22:51   ` Alejandro Colomar
2022-12-14 22:57 ` Andrew Pinski
2022-12-14 23:14   ` Alejandro Colomar
2022-12-15 20:50     ` Martin Sebor
2022-12-15 22:03       ` Alejandro Colomar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).