From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x332.google.com (mail-ot1-x332.google.com [IPv6:2607:f8b0:4864:20::332]) by sourceware.org (Postfix) with ESMTPS id 6863E3858410 for ; Fri, 13 Aug 2021 21:30:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6863E3858410 Received: by mail-ot1-x332.google.com with SMTP id 108-20020a9d01750000b029050e5cc11ae3so13665060otu.5 for ; Fri, 13 Aug 2021 14:30:04 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language; bh=NVvTp3ah3Fi98uH7re43Se71aiit9rI6bGxDSJ15s0w=; b=Bt6z8bQWhC8g7ivhyiCcK6j2fniKnBQS2HkH6tK35tajkQdB+X+C78l3lIMe5/lyWd adLYRXYDhIKS0oPE+6mrMLxybysCo5iTj5GDkuegUD57SFyXVwXXAVy6xSots+TKkgIx yJDNOkD0B/y/WT2kK+bc7JxdcVPrFjpn7E4jUmK948FNVLSBWWVYl5dxXv6puU8rF09Y X/l56w1EwA42I4p8205a5x4spml209HI/F3QeLjHpFgbsHy/BH0nNPwJeLQHDOCg7NAb QhvRjdM043ZCNI6sPCAqELyGtvSlYC6EI3ILpPkoG8e1Wge9LSEikFCnLss2bo8vyTQe t/8w== X-Gm-Message-State: AOAM532k1IyR/Xpd26Fd+gBPeyMjB9AsglUPYEs7IgCwjUIz1ao7RLM3 fCVxEwvDsis0vB50tZb5BBkkehvDHsw= X-Google-Smtp-Source: ABdhPJxxUxvzeYD2bmLBQpqS8aGt3IpqtWwQTXXjUXPV2+2OKHdJgLsguHZo7OYjCeYgsFBwIeIdcQ== X-Received: by 2002:a9d:1286:: with SMTP id g6mr3738133otg.282.1628890203522; Fri, 13 Aug 2021 14:30:03 -0700 (PDT) Received: from [192.168.0.41] (97-118-104-61.hlrn.qwest.net. [97.118.104.61]) by smtp.gmail.com with ESMTPSA id w8sm558597otq.36.2021.08.13.14.30.02 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 13 Aug 2021 14:30:03 -0700 (PDT) Subject: Re: [PATCH] remove attribute access from regexec To: Paul Eggert Cc: GNU C Library References: <15a32181-8060-4135-cb72-9e79831697d5@gmail.com> From: Martin Sebor Message-ID: Date: Fri, 13 Aug 2021 15:30:02 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------C6D2C59064759DC02F5C3B00" Content-Language: en-US X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Aug 2021 21:30:05 -0000 This is a multi-part message in MIME format. --------------C6D2C59064759DC02F5C3B00 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit On 8/13/21 2:11 PM, Paul Eggert wrote: > On 8/13/21 11:26 AM, Martin Sebor via Libc-alpha wrote: >> -    __attr_access ((__write_only__, 4, 3)); >> +    __attr_access ((__read_write__, 4, 3)); > > Wouldn't it be simpler to remove the __attr_access instead? > > What utility does __attr_access ((__read_write__, 4, 3)) have? FWIW, > Glibc currently does not use such an attribute anywhere. The attribute serves two purposes: it specifies 1) how the function might access the array (to determine whether it should be initialized by the caller) and 2) the minimum number of elements the caller must provide and the maximum number of elements the function definition might access. GCC checks to make sure these constraints are satisfied, both at call sites and in the definitions of these functions. The read_write mode means that a function may both read from and write to the array, and expects it to be initialized. But since regexec might just write to the array and not read from it, depending on flags, the read_write mode wouldn't be correct either. There is no attribute access mode that would capture this but in C99 (though sadly not in C++) and thanks to the nmatch argument preceding pmatch, the VLA argument syntax does make the checking possible. It's like attribute access with an unspecified mode. (It might be worth extending the attribute to express this mode so it can be used when the bound doesn't precede the VLA argument and in C++.) Attached is a revised patch with this approach. Martin PS POSIX says regexec() ignores pnmatch when REG_NOSUB is set, so strictly speaking, warning for such calls to it in that case is also a false positive. I think that's an acceptable trade-off for the buffer overflow detection (passing a zero nmatch in that case is a trivial way to avoid the warning). --------------C6D2C59064759DC02F5C3B00 Content-Type: text/x-patch; charset=UTF-8; name="glibc-28170.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="glibc-28170.diff" diff --git a/include/regex.h b/include/regex.h index 24eca2c297..d67ec65ad9 100644 --- a/include/regex.h +++ b/include/regex.h @@ -36,8 +36,15 @@ extern void __re_set_registers extern int __regcomp (regex_t *__preg, const char *__pattern, int __cflags); libc_hidden_proto (__regcomp) +#if 199901L <= __STDC_VERSION__ && !defined (__STDC_NO_VLA__) extern int __regexec (const regex_t *__preg, const char *__string, - size_t __nmatch, regmatch_t __pmatch[], int __eflags); + size_t __nmatch, regmatch_t __pmatch[__nmatch], + int __eflags); +#else +extern int __regexec (const regex_t *__preg, const char *__string, + size_t __nmatch, regmatch_t __pmatch[], + int __eflags); +#endif libc_hidden_proto (__regexec) extern size_t __regerror (int __errcode, const regex_t *__preg, diff --git a/posix/regex.h b/posix/regex.h index 14fb1d8364..ebacbaf95d 100644 --- a/posix/regex.h +++ b/posix/regex.h @@ -652,11 +652,17 @@ extern int regcomp (regex_t *_Restrict_ __preg, const char *_Restrict_ __pattern, int __cflags); +#if 199901L <= __STDC_VERSION__ && !defined (__STDC_NO_VLA__) extern int regexec (const regex_t *_Restrict_ __preg, const char *_Restrict_ __String, size_t __nmatch, - regmatch_t __pmatch[_Restrict_arr_], - int __eflags) - __attr_access ((__write_only__, 4, 3)); + regmatch_t __pmatch[_Restrict_arr_ __nmatch], + int __eflags); +#else +extern int regexec (const regex_t *_Restrict_ __preg, + const char *_Restrict_ __String, size_t __nmatch, + regmatch_t __pmatch[_Restrict_arr], + int __eflags); +#endif extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg, char *_Restrict_ __errbuf, size_t __errbuf_size) --------------C6D2C59064759DC02F5C3B00--