From: Noah Goldstein <goldstein.w.n@gmail.com>
To: libc-coord@lists.openwall.com
Cc: GNU C Library <libc-alpha@sourceware.org>,
Richard Biener via Gcc <gcc@gcc.gnu.org>
Subject: Add new ABIs '__strcmpeq', '__strncmpeq', '__wcscmpeq' and '__wcsncmpeq' to libc
Date: Thu, 20 Jan 2022 16:56:59 -0600
Message-ID: <CAFUsyfKOtCxcM84LAPwmh=h+Kkz7x-CvGvsBFUW-wY1Zjv00eg@mail.gmail.com>
Hi All,
This is a proposal for four new interfaces to be supported by libc.
This is essentially the same proposal as '__memcmpeq()':
https://sourceware.org/pipermail/libc-alpha/2021-September/131099.html
but applied to the character string and wide-character string
comparison functions.
#### Interfaces ####
int __strcmpeq(const char * s1, const char * s2)
int __strncmpeq(const char * s1, const char * s2, size_t n)
int __wcscmpeq(const wchar_t * ws1, const wchar_t * ws2)
int __wcsncmpeq(const wchar_t * ws1, const wchar_t * ws2, size_t n)
#### Descriptions ####
- '__strcmpeq()'
- The '__strcmpeq()' function shall compare the string pointed to
by 's1' to the string pointed to by 's2'. If the two strings are
equal the return value will be zero. Otherwise the return value
will be some non-zero value. 'strcmp()' is a valid
implementation of '__strcmpeq()'.
- '__strncmpeq()'
- The '__strncmpeq()' function shall compare not more than 'n'
bytes (bytes that follow a null byte are not compared) from the
array pointed to by 's1' to the array pointed to by 's2'. If the
two arrays are equal over the compared bytes, the return value
will be zero. Otherwise the return value will be some non-zero
value. 'strncmp()' is a valid implementation of '__strncmpeq()'.
- '__wcscmpeq()'
- The '__wcscmpeq()' function shall compare the wide-character
string pointed to by 'ws1' to the wide-character string pointed
to by 'ws2'. If the two wide-character strings are equal the
return value will be zero. Otherwise the return value will be
some non-zero value. 'wcscmp()' is a valid implementation of
'__wcscmpeq()'.
- '__wcsncmpeq()'
- The '__wcsncmpeq()' function shall compare not more than 'n'
wide-character codes (wide-character codes that follow a null
wide-character code are not compared) from the array pointed to
by 'ws1' to the array pointed to by 'ws2'. If the two arrays are
equal over the compared wide-character codes, the return value
will be zero. Otherwise the return value will be some non-zero
value. 'wcsncmp()' is a valid implementation of
'__wcsncmpeq()'.
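To make the descriptions above concrete, here is a minimal portable
sketch of the two unbounded variants. The names 'sketch_strcmpeq' and
'sketch_wcscmpeq' are hypothetical stand-ins chosen for illustration;
they are not the proposed symbols and this is not how an optimized
libc would implement them. The sketch only shows that no ordering
information has to be computed.

```c
#include <wchar.h>

/* Hypothetical stand-in for the proposed '__strcmpeq()'.
   Returns 0 if the strings are equal, 1 otherwise.  */
int sketch_strcmpeq(const char *s1, const char *s2)
{
    for (;; s1++, s2++) {
        if (*s1 != *s2)
            return 1;   /* any non-zero value would be allowed */
        if (*s1 == '\0')
            return 0;   /* both strings ended at the same position */
    }
}

/* Hypothetical stand-in for the proposed '__wcscmpeq()'.  */
int sketch_wcscmpeq(const wchar_t *ws1, const wchar_t *ws2)
{
    for (;; ws1++, ws2++) {
        if (*ws1 != *ws2)
            return 1;
        if (*ws1 == L'\0')
            return 0;
    }
}
```

Because only equality matters, the mismatch branch never has to work
out which operand is greater.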
#### Use Case ####
The goal is that the new interfaces will be usable as an optimization
by compilers if a program uses the return value of the non "eq"
variant as a boolean. For example:
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Every comparison result below is only tested against zero.  */
void
foo (const char *s1, const char *s2, const wchar_t *ws1, const wchar_t *ws2,
     size_t n)
{
  if (!strcmp (s1, s2))
    printf ("strcmp can be optimized to __strcmpeq in this use case\n");
  if (strncmp (s1, s2, n))
    printf ("strncmp can be optimized to __strncmpeq in this use case\n");
  if (wcscmp (ws1, ws2))
    printf ("wcscmp can be optimized to __wcscmpeq in this use case\n");
  if (!wcsncmp (ws1, ws2, n))
    printf ("wcsncmp can be optimized to __wcsncmpeq in this use case\n");
}
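As a rough illustration of where the rewrite is and is not applicable,
the sketch below contrasts a boolean use with an ordering use. The
function 'standin_strcmpeq' is a hypothetical stand-in that returns
only 0 or 1; all function names here are assumptions made for the
example.

```c
#include <string.h>

/* Hypothetical stand-in for the proposed '__strcmpeq()': it discards
   the sign and magnitude that strcmp() would report.  */
static int standin_strcmpeq(const char *s1, const char *s2)
{
    return strcmp(s1, s2) != 0;
}

/* Boolean use: the result is only tested against zero, so a compiler
   could substitute the "eq" variant.  */
int is_equal_original(const char *a, const char *b)
{
    return !strcmp(a, b);
}

int is_equal_rewritten(const char *a, const char *b)
{
    return !standin_strcmpeq(a, b);
}

/* Ordering use: the sign of the result matters, so strcmp() has to
   stay.  An "eq" variant is not a valid replacement here.  */
int sorts_before(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```

The first two functions agree for every pair of valid strings, while
the third depends on information the "eq" variant is free to drop.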
#### Argument Specifications ####
- '__strcmpeq()' has the exact same argument specifications as 'strcmp()'
- 's1' is a null terminated character string.
- 's2' is a null terminated character string.
- '__strncmpeq()' has the exact same argument specifications as 'strncmp()'
- 's1' is a character sequence terminated either by a null character
or by the limit 'n'
- 's2' is a character sequence terminated either by a null character
or by the limit 'n'
- 'n' is the maximum number of characters to compare
- '__wcscmpeq()' has the exact same argument specifications as 'wcscmp()'
- 'ws1' is a null terminated wide-character string.
- 'ws2' is a null terminated wide-character string.
- '__wcsncmpeq()' has the exact same argument specifications as 'wcsncmp()'
- 'ws1' is a wide-character sequence terminated either by a null
wide-character or by the limit 'n'
- 'ws2' is a wide-character sequence terminated either by a null
wide-character or by the limit 'n'
- 'n' is the maximum number of wide-characters to compare
For each of these functions, if any of the input constraints are not
met, the result is undefined.
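The interaction between the null terminator and the limit 'n' in the
bounded variants can be sketched as follows. The names
'sketch_strncmpeq' and 'sketch_wcsncmpeq' are hypothetical and used
only for illustration; the loops assume the arguments meet the
constraints listed above.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the proposed '__strncmpeq()'.
   Stops at the first mismatch, the first null, or after 'n'
   characters, whichever comes first.  */
int sketch_strncmpeq(const char *s1, const char *s2, size_t n)
{
    for (; n > 0; n--, s1++, s2++) {
        if (*s1 != *s2)
            return 1;
        if (*s1 == '\0')
            return 0;   /* characters after the null are not compared */
    }
    return 0;           /* first 'n' characters matched */
}

/* Hypothetical stand-in for the proposed '__wcsncmpeq()'.  */
int sketch_wcsncmpeq(const wchar_t *ws1, const wchar_t *ws2, size_t n)
{
    for (; n > 0; n--, ws1++, ws2++) {
        if (*ws1 != *ws2)
            return 1;
        if (*ws1 == L'\0')
            return 0;
    }
    return 0;
}
```

With 'n' equal to zero nothing is compared and the result is zero,
which matches the behavior of 'strncmp()' and 'wcsncmp()'.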
#### Return Value Specification ####
- '__strcmpeq()'
- if 's1' and 's2' are equal, the return value is zero. Otherwise
the return value is any non-zero value. 'strcmp()',
'!!strcmp()', or '-strcmp()' are all valid implementations of
'__strcmpeq()'.
- '__strncmpeq()'
- if 's1' and 's2' are equal up to the first 'n' characters or up
to and including the first null character, the return value is
zero. Otherwise the return value is any non-zero
value. 'strncmp()', '!!strncmp()', or '-strncmp()' are all valid
implementations of '__strncmpeq()'.
- '__wcscmpeq()'
- if 'ws1' and 'ws2' are equal, the return value is
zero. Otherwise the return value is any non-zero
value. 'wcscmp()', '!!wcscmp()', or '-wcscmp()' are all valid
implementations of '__wcscmpeq()'.
- '__wcsncmpeq()'
- if 'ws1' and 'ws2' are equal up to the first 'n' wide-characters
or up to and including the first null wide-character, the return
value is zero. Otherwise the return value is any non-zero
value. 'wcsncmp()', '!!wcsncmp()', or '-wcsncmp()' are all valid
implementations of '__wcsncmpeq()'.
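The three forms named for '__strcmpeq()' above can be written out as
thin wrappers to show that they differ only in which non-zero value
they report. The wrapper names are hypothetical and exist only for
this sketch.

```c
#include <string.h>

/* Forwards the strcmp() result unchanged: zero on equality, some
   non-zero value otherwise.  */
int eq_via_strcmp(const char *s1, const char *s2)
{
    return strcmp(s1, s2);
}

/* Normalizes the result to exactly 0 or 1.  */
int eq_via_bool(const char *s1, const char *s2)
{
    return !!strcmp(s1, s2);
}

/* Flips the sign; the result is still zero only on equality.  */
int eq_via_negated(const char *s1, const char *s2)
{
    return -strcmp(s1, s2);
}
```

A caller that relies only on the specified contract tests the result
against zero and cannot tell these three apart.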
#### Notes ####
These interfaces are designed intentionally so that the non "eq"
variant of each function will be a valid implementation of the
corresponding "eq" variant.
#### ABI vs API ####
This proposal is for '__strcmpeq()', '__strncmpeq()', '__wcscmpeq()',
and '__wcsncmpeq()' as new ABIs. As ABIs, the interfaces are valuable
as an optimization that compilers can apply to the idiomatic boolean
usage of the return value of the existing comparison functions.
Best,
Noah