Message-ID: <79020a8a-d499-4040-92e7-ecb63965a4e8@jguk.org>
Date: Fri, 10 Nov 2023 23:44:31 +0000
Subject: Re: strncpy clarify result may not be null terminated
To: Alejandro Colomar , Paul Eggert
Cc: Matthew House , linux-man , GNU C Library
References: <20231109031345.245703-1-mattlloydhouse@gmail.com> <250e0401-2eaa-461f-ae20-a7f44d0bc5ad@jguk.org>
From: Jonny Grant

On 10/11/2023 20:19, Alejandro Colomar wrote:
> Hi Paul,
>
> On Fri, Nov 10, 2023 at 07:36:33PM +0100, Alejandro Colomar wrote:
>> Hi Paul,
>>
>>
>> On Fri, Nov 10, 2023 at 09:58:42AM -0800, Paul Eggert wrote:
>>> On 2023-11-10 03:05, Alejandro Colomar wrote:
>>>> Hopefully, it won't be so bad in terms of performance.
>>>
>>> It's significantly slower than strncpy for typical use (smallish fixed-size
>>> destination buffers). So just use strncpy for that. It may be bad, but it's
>>> better than the alternatives you've mentioned. You can package strncpy
>>> inside a [[nodiscard]] inline wrapper if you like.
>>>
>>> More importantly, the manual should not push strlcpy as being superior or
>>> being in any way a "fix" for strncpy's problems. strlcpy is worse than
>>> strncpy in important ways and besides - as mentioned in the glibc manual -
>>> neither function is a good choice for string processing.
>>
>> Hmmmm, that sounds convincing. How about this as a starting point?
>
> Something slightly better:
>
> diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
> index 3cf4eb371..8ffedae01 100644
> --- a/man3/stpncpy.3
> +++ b/man3/stpncpy.3
> @@ -67,6 +67,88 @@ .SH DESCRIPTION
>  }
>  .EE
>  .in
> +.\"
> +.SS Producing a string in a fixed-width buffer
> +Programs should normally avoid arbitrary string limitations.
> +However, some programs may need to write strings into fixed-width buffers.
> +.P
> +Although this function wasn't designed to produce a string,
> +it can be used with appropriate care for that purpose.
> +There are two main cases where it can be useful:
> +.IP \[bu] 3
> +Copying a string into a new string in a fixed-width buffer,
> +preventing buffer overflow.
> +.IP \[bu]
> +Copying a string into a new string in a fixed-width buffer,
> +with truncation.
> +.P
> +Using
> +.BR strncpy (3)
> +in any of those cases is prone to several classes of bugs,
> +so it is recommended that you write a wrapper function
> +that encloses all the dangers.

Some feedback about the last line: "that covers all the risks" is clearer.

> +.TP
> +Copying a string preventing buffer overflow
> +.in +4n
> +.EX
> +[[nodiscard]]
> +inline ssize_t
> +strxcpy(char *restrict dst, const char *restrict src, char dsize)
> +{
> +    char *p;
> +
> +    if (dsize == 0)
> +        return -1;
> +
> +    p = stpncpy(dst, src, dsize);
> +    if (dst[dsize - 1] != '\0')
> +        return -1;
> +
> +    return p - dst;
> +}
> +.EE
> +.in
> +.P
> +If it returns -1,
> +the contents of
> +.I dst
> +are undefined,
> +and the program should handle the error.
> +.P
> +You could implement a similar function in terms of
> +.BR strlen (3)
> +and
> +.BR memcpy (3),
> +or in terms of
> +.BR strlcpy (3),
> +and it would be simpler,
> +but this implementation is faster.

I suggest adding a little more information here; you could append "because it accesses less memory".

> +.\"
> +.TP
> +Copying a string with truncation
> +Truncation is almost always a bug.
> +However, in the few cases where it is not a bug,
> +you can use the following function.
> +.in +4n
> +.EX
> +inline ssize_t
> +strtcpy(char *restrict dst, const char *restrict src, char dsize)
> +{
> +    char *p;
> +
> +    if (dsize == 0)
> +        return -1;
> +
> +    p = stpncpy(dst, src, dsize);
> +    if (dst[dsize - 1] != '\0') {
> +        dst[dsize - 1] = '\0';
> +        p--;
> +    }
> +
> +    return p - dst;
> +}
> +.EE
> +.in
>  .SH RETURN VALUE
>  .TP
>  .BR strncpy ()
>
>
> However, note how many branches we need to make a function that handles
> all corner cases. Is it still faster than strnlen+memcpy? stpncpy must
> be heavily optimized for that. Also, strnlen(3) might be optimized out
> by the compiler in many cases, so maybe in real code it would be better
> to use memcpy. I'd very much like to see some numbers.

A benchmark would settle the performance question. It shouldn't take many lines of code in a loop to measure this.

strnlen_s is in Annex K of the C standard, but strnlen hasn't made it into the standard yet, not even in C23.

Kind regards
Jonny
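
P.S. A rough sketch of what such a benchmark could look like, in case it helps. It is only an illustration, not a proposal for the page. Variant A is the stpncpy(3) wrapper from the quoted patch, except that I have assumed dsize was meant to be size_t rather than char. Variant B is my own guess at an equivalent strnlen(3)+memcpy(3) version. The names strxcpy_stpncpy, strxcpy_memcpy and bench_copy are made up for this example, and a serious measurement would need more care (warm-up, varied string lengths, stopping the compiler from inlining or hoisting the call).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* stpncpy(3), strnlen(3), clock_gettime(2) */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

/* Variant A: wrapper from the quoted patch; dsize assumed to be size_t. */
static ssize_t
strxcpy_stpncpy(char *restrict dst, const char *restrict src, size_t dsize)
{
    char *p;

    if (dsize == 0)
        return -1;

    p = stpncpy(dst, src, dsize);
    if (dst[dsize - 1] != '\0')
        return -1;              /* src did not fit, dst is not a string */

    return p - dst;
}

/* Variant B: hypothetical strnlen+memcpy equivalent (not from the patch). */
static ssize_t
strxcpy_memcpy(char *restrict dst, const char *restrict src, size_t dsize)
{
    size_t len;

    if (dsize == 0)
        return -1;

    len = strnlen(src, dsize);
    if (len == dsize)
        return -1;              /* no room for the terminating null byte */

    memcpy(dst, src, len + 1);  /* copies the terminator as well */
    return (ssize_t) len;
}

typedef ssize_t (*copy_fn)(char *restrict, const char *restrict, size_t);

/* Call fn iters times on a local buffer; return elapsed seconds. */
static double
bench_copy(copy_fn fn, const char *src, size_t dsize, long iters)
{
    char buf[256];
    struct timespec t0, t1;
    volatile ssize_t sink = 0;  /* keeps the results observable */

    assert(dsize <= sizeof(buf));

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        sink += fn(buf, src, dsize);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void) sink;

    return (double) (t1.tv_sec - t0.tv_sec)
         + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

To use it, call bench_copy(strxcpy_stpncpy, "some short string", 64, 10000000L) and the same with strxcpy_memcpy, then print both times. Note that variant A always writes all dsize bytes of the buffer (stpncpy pads with null bytes), while variant B writes only len + 1 bytes, so the result will probably depend a lot on how the string length compares to the buffer size.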