To: glibc
Cc: tech@openbsd.org, Christoph Hellwig, "linux-kernel@vger.kernel.org"
From: "Alejandro Colomar (man-pages)"
Subject: [RFC] strcpys(): New function for copying strings safely
Message-ID: <755875ec-baae-6cab-52a8-3c9530db1ce6@gmail.com>
Date: Sun, 27 Jun 2021 21:26:36 +0200
List-Id: Libc-alpha mailing list

Hi,

I recently re-read the discussions about strscpy() and strlcpy().

https://lwn.net/Articles/612244/
https://sourceware.org/legacy-ml/libc-alpha/2000-08/msg00052.html
https://mafford.com/text/the-many-ways-to-copy-a-string-in-c/
https://lkml.org/lkml/2017/7/14/637
https://lwn.net/Articles/659214/

Arguments against strlcpy():

- Inefficiency (compared to memcpy())
- It doesn't report truncation, needing extra checks by the user
- It doesn't prevent overrunning the source string (requires a C-string input)

Arguments in favor of strlcpy():

- Avoid code bloat
- Self-documenting code
- Avoid everybody rolling their own strcat() safe variant
- Prevent buffer overflows

I rolled my own safer strcpy() functions some time ago, and after reading those discussions, I decided to propose them for inclusion in glibc. I think they address all of the arguments above (well, efficiency will never be as good as memcpy(), I guess, but a good compiler and -flto can make it good enough).

The proposed function, strcpys_np(), reports two kinds of errors: "hard" errors, where the string is invalid, and "soft" errors, where the string is valid but truncated. The method for reporting the errors is an 'int' return value, which is 0 for success, -1 for a "hard" error, and 1 for a "soft" error.
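For comparison with the error reporting just described, here is a rough sketch of how a strlcpy()-style interface behaves. The names sketch_strlcpy() and sketch_copy_truncated() are hypothetical stand-ins written for illustration, not the libc function. The sketch shows the two criticisms listed earlier: the source is measured with strlen(), so it must be a NUL-terminated string, and the caller has to compare the return value with the buffer size to notice truncation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy()-like semantics; not the libc function. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    /* strlen() walks the whole source, so src must be NUL-terminated. */
    size_t srclen = strlen(src);

    if (size > 0) {
        size_t n = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }

    /* The length of the source is returned; the caller must compare it
       with size to find out whether the copy was truncated. */
    return srclen;
}

/* The extra check a caller needs: returns 1 if truncated, 0 otherwise. */
int sketch_copy_truncated(char *dst, const char *src, size_t size)
{
    return sketch_strlcpy(dst, src, size) >= size;
}
```

With this style there is no distinct value for an invalid size: a size of 0 writes nothing and looks like an ordinary truncation to the caller.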

The number of written characters, which strcpy()-like functions typically return, is now moved to a pointer parameter (4th parameter).

The order of the parameters has been changed (compared to other typical strcpy() variants), according to a new principle of the C2x standard, which says that "the size of an array appears before the array".

It doesn't require a C string in the input. It only reads up to the size limit provided by the user.

I added the _np suffix to the function to mark it as non-portable (similar to existing practice in glibc with non-standard functions).

I used 'ssize_t' instead of 'size_t', because I consider unsigned types to be inherently unsafe. See .

It is designed so that complete usage (including error handling checks) requires the minimum number of lines of code:

[[
// When we already checked that 'size' is >= 1
// and truncation is not an issue:

strcpys_np(size, dest, src, NULL);

// When we didn't check the value of 'size',
// but truncation is still not an issue:

if (strcpys_np(size, dest, src, NULL) == -1)
	goto handle_hard_error;

// When truncation is an issue:

if (strcpys_np(size, dest, src, NULL))
	goto handle_all_errors;

// When truncation is an error,
// but it requires a different handling than a "hard" error:

status = strcpys_np(size, dest, src, NULL);
if (status == 1)
	goto handle_truncation;
if (status)
	goto handle_hard_error;
]]

After any of those samples of code, the string has been copied, and is a valid C-string.
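
To try the call patterns above in compilable form, here is a minimal sketch. demo_strcpys() and demo_describe() are hypothetical names, and demo_strcpys() is only a stand-in written in plain C99 that follows the described return convention (0, 1, -1) and parameter order; it is not the proposed implementation. It assumes a POSIX system for ssize_t and uses memchr() so that at most 'size' bytes of the source are read.

```c
#include <string.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/*
 * Hypothetical stand-in following the return convention described above:
 *    0  the whole string was copied,
 *    1  "soft" error: copied, but truncated,
 *   -1  "hard" error: size <= 0, nothing was written.
 * At most 'size' bytes of src are read.
 */
int demo_strcpys(ssize_t size, char *dest, const char *src, ssize_t *len)
{
    if (size <= 0) {
        if (len)
            *len = -1;
        return -1;
    }

    /* Look for the terminator within the first 'size' bytes only. */
    const char *nul = memchr(src, '\0', (size_t)size);
    int truncated = (nul == NULL);
    ssize_t n = truncated ? size - 1 : (ssize_t)(nul - src);

    memcpy(dest, src, (size_t)n);
    dest[n] = '\0';
    if (len)
        *len = n;

    return truncated;
}

/* Caller where truncation needs different handling than a "hard" error. */
const char *demo_describe(ssize_t size, char *dest, const char *src)
{
    int status = demo_strcpys(size, dest, src, NULL);

    if (status == 1)
        return "truncated";
    if (status)
        return "hard error";
    return "ok";
}
```

Passing NULL as the last argument, as in the samples, skips the length report; passing a pointer yields the number of characters written, or -1 on a "hard" error.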

Here goes the code (strcpys_np() is defined in terms of strscpy_np(), which is similar to the known strscpy(), but with some of the improvements mentioned above, such as using array parameters and ssize_t):

[[
#include <string.h>
#include <sys/types.h>

[[gnu::nonnull]]
ssize_t
strscpy_np(ssize_t size,
           char dest[static restrict size],
           const char src[static restrict size])
{
	ssize_t len;

	if (size <= 0)
		return -1;

	/* Copy at most size - 1 characters, leaving room for the NUL. */
	len = strnlen(src, size - 1);
	memcpy(dest, src, len);
	dest[len] = '\0';

	return len;
}

[[gnu::nonnull(2, 3)]]
int
strcpys_np(ssize_t size,
           char dest[static restrict size],
           const char src[static restrict size],
           ssize_t *restrict len)
{
	ssize_t l;

	l = strscpy_np(size, dest, src);
	if (len)
		*len = l;

	if (l == -1)
		return -1;
	/* strscpy_np() never returns more than size - 1, so truncation
	   is detected by checking whether the source continues. */
	if (l == size - 1 && src[l] != '\0')
		return 1;
	return 0;
}
]]

I may have introduced some bugs right now, as I adapted the code a bit before sending, but I expect it to be free of any of the known bugs of the existing str*cpy() interfaces.

What do you think about this function? Would you want to add it to glibc?

Thanks,

Alex

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/