From: Noah Goldstein
Date: Thu, 16 Sep 2021 12:02:02 -0500
Subject: Add new ABI '__memcmpeq()' to libc
To: libc-coord@lists.openwall.com
Cc: GNU C Library, gcc@gcc.gnu.org

Hi All,

This is a proposal for a new interface to be supported by libc. The
new interface has the same behavior as the old 'bcmp()' routine.
Essentially, the goal of this proposal is to add a new function in the
reserved namespace, '__memcmpeq()', which shares the behavior of the
old 'bcmp()'.

#### Interface ####

    int __memcmpeq(const void *s1, const void *s2, size_t n);

#### Description ####

The '__memcmpeq()' function would compare the two byte sequences 's1'
and 's2', each of length 'n'. If the two byte sequences are equal, the
return value would be zero. Otherwise it would return some non-zero
value. 'memcmp()' is a valid implementation of '__memcmpeq()'.

#### Use Case ####

1. The goal is that '__memcmpeq()' will be usable as an optimization
   by compilers when a program uses the return value of 'memcmp()'
   only as a boolean. For example:

    #include <stdio.h>
    #include <string.h>

    void foo(const void *s1, const void *s2, size_t n)
    {
        if (!memcmp(s1, s2, n)) {
            printf("memcmp can be optimized to __memcmpeq in this use case\n");
        }
    }

   - In the above case '__memcmpeq()' could be used instead. Because
     the constraints on the return value of '__memcmpeq()' are
     simpler, it can be implemented more efficiently for this case
     than 'memcmp()'. If there is no separately optimized version,
     '__memcmpeq()' can alias 'memcmp()' and thus be at least as fast.

2. Possible use cases in security, as the runtime of the function can
   be *more* oblivious to the byte sequences being compared.

#### Argument Specifications ####

1. 's1' - All 'n' bytes in the byte sequence starting at 's1' and
   ending at, but not including, 's1 + n' must be accessible memory.
   There are no guarantees about the order in which the sequence will
   be traversed.

2. 's2' - All 'n' bytes in the byte sequence starting at 's2' and
   ending at, but not including, 's2 + n' must be accessible memory.
   There are no guarantees about the order in which the sequence will
   be traversed.

3. 'n' - 'n' may be any value that does not violate the
   specifications on 's1' and 's2'.

If any of the argument specifications are violated, there are no
guarantees about the behavior of the interface.

#### Return Value Specification ####

If the byte sequences starting at 's1' and 's2' are equal, the
function will return zero. Otherwise the function will return a
non-zero value.

Equality between the byte sequences starting at 's1' and 's2' is
defined as follows:

1. If 'n' is zero, the two sequences are equal.

2. If 'n' is non-zero, then for all 'i' in the range [0, n) the byte
   at offset 'i' of 's1' equals the byte at offset 'i' of 's2'.

A simple C implementation of '__memcmpeq()' could be as follows:

    #include <stddef.h>

    int __memcmpeq(const void *s1, const void *s2, size_t n)
    {
        int ret;
        size_t i;
        const char *s1c, *s2c;

        s1c = (const char *)s1;
        s2c = (const char *)s2;
        /* Stop at the first differing byte; any non-zero value will do. */
        for (i = 0, ret = 0; ret == 0 && i < n; ++i) {
            ret = s1c[i] - s2c[i];
        }
        return ret;
    }

#### Notes ####

This interface is essentially the old 'bcmp()', and 'memcmp()' will
always be a valid implementation of '__memcmpeq()'.

#### ABI vs API ####

This proposal is for '__memcmpeq()' as a new ABI. As an ABI,
'__memcmpeq()' will have value because using the return value of
'memcmp()' as a boolean is quite idiomatic in C code. It is, however,
possible that this would also be useful as a new API, especially if
there are likely use cases where the compiler is unable to prove that
'__memcmpeq()' would be a valid replacement for 'memcmp()'.

#### Further Options ####

If this proposal is received positively, libc could also add
interfaces for '__streq()', '__strneq()', '__wcseq()' and
'__wcsneq()', which would similarly loosen the return value
restrictions of 'strcmp()', 'strncmp()', 'wcscmp()' and 'wcsncmp()'
respectively.

Best,
Noah
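
For illustration only, here is a minimal sketch of what the relaxed
return value permits. Because the result only has to be zero or
non-zero, an implementation does not need to locate the first
mismatching byte or stop there; it can simply accumulate differences
over the whole range. The name 'memcmpeq_sketch' is hypothetical and
not part of any libc, and this is not a claim about how an optimized
version would actually be written.

```c
#include <stddef.h>

/* Hypothetical helper for illustration; not an existing libc symbol. */
static int memcmpeq_sketch(const void *s1, const void *s2, size_t n)
{
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;
    unsigned char diff = 0;
    size_t i;

    /* OR together the XOR of every byte pair. There is no early exit
     * and no need to know which byte differed or in which direction. */
    for (i = 0; i < n; ++i) {
        diff |= (unsigned char)(a[i] ^ b[i]);
    }

    /* Zero if and only if all n byte pairs matched (including n == 0). */
    return diff;
}
```

A caller would use it exactly like the boolean 'memcmp()' idiom, e.g.
'if (!memcmpeq_sketch(s1, s2, n))'. Note that a loop written this way
does not by itself guarantee data-independent timing, since the
compiler is free to transform it.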