public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] remove attribute access from regexec
@ 2021-08-13 18:26 Martin Sebor
  2021-08-13 20:11 ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Sebor @ 2021-08-13 18:26 UTC (permalink / raw)
  To: GNU C Library

A recent GCC enhancement to detect accesses by functions declared
with attribute access that are in conflict with the mode specified
by the attribute triggers a warning for the definition of regexec:

regexec.c: In function ‘__regexec’:
regexec.c:204:13: warning: ‘*pmatch.rm_so’ may be used uninitialized 
[-Wmaybe-uninitialized]
regexec.c:192:101: note: accessing argument 4 of a function declared 
with attribute ‘access (write_only, 4, 3)’

The attribute was added based on the POSIX description of
the function that implies the pmatch array referenced by it is
write-only.  However, when the REG_STARTEND bit is set in flags,
Glibc also reads from the object.  This seems to be an extension
to POSIX that's not mentioned in the Glibc manual but that is
documented in the Linux man pages:

   https://man7.org/linux/man-pages/man3/regcomp.3.html

The patch below changes the attribute's mode to read_write to
reflect this extension.

Martin

diff --git a/posix/regex.h b/posix/regex.h
index 14fb1d8364..ba8351f873 100644
--- a/posix/regex.h
+++ b/posix/regex.h
@@ -656,7 +656,7 @@ extern int regexec (const regex_t *_Restrict_ __preg,
                     const char *_Restrict_ __String, size_t __nmatch,
                     regmatch_t __pmatch[_Restrict_arr_],
                     int __eflags)
-    __attr_access ((__write_only__, 4, 3));
+    __attr_access ((__read_write__, 4, 3));

  extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
                         char *_Restrict_ __errbuf, size_t __errbuf_size)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] remove attribute access from regexec
  2021-08-13 18:26 [PATCH] remove attribute access from regexec Martin Sebor
@ 2021-08-13 20:11 ` Paul Eggert
  2021-08-13 21:30   ` Martin Sebor
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-08-13 20:11 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GNU C Library

On 8/13/21 11:26 AM, Martin Sebor via Libc-alpha wrote:
> -    __attr_access ((__write_only__, 4, 3));
> +    __attr_access ((__read_write__, 4, 3));

Wouldn't it be simpler to remove the __attr_access instead?

What utility does __attr_access ((__read_write__, 4, 3)) have? FWIW, 
Glibc currently does not use such an attribute anywhere.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] remove attribute access from regexec
  2021-08-13 20:11 ` Paul Eggert
@ 2021-08-13 21:30   ` Martin Sebor
  2021-08-13 22:34     ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Sebor @ 2021-08-13 21:30 UTC (permalink / raw)
  To: Paul Eggert; +Cc: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 1830 bytes --]

On 8/13/21 2:11 PM, Paul Eggert wrote:
> On 8/13/21 11:26 AM, Martin Sebor via Libc-alpha wrote:
>> -    __attr_access ((__write_only__, 4, 3));
>> +    __attr_access ((__read_write__, 4, 3));
> 
> Wouldn't it be simpler to remove the __attr_access instead?
> 
> What utility does __attr_access ((__read_write__, 4, 3)) have? FWIW, 
> Glibc currently does not use such an attribute anywhere.

The attribute serves two purposes: it specifies 1) how the function
might access the array (to determine whether it should be initialized
by the caller) and 2) the minimum number of elements the caller must
provide and the maximum number of elements the function definition
might access.

GCC checks to make sure these constraints are satisfied, both at call
sites and in the definitions of these functions.

The read_write mode means that a function may both read from and write
to the array, and expects it to be initialized.  But since regexec
might just write to the array and not read from it, depending on
flags, the read_write mode wouldn't be correct either.

There is no attribute access mode that would capture this but in C99
(though sadly not in C++) and thanks to the nmatch argument preceding
pmatch, the VLA argument syntax does make the checking possible.  It's
like attribute access with an unspecified mode.  (It might be worth
extending the attribute to express this mode so it can be used when
the bound doesn't precede the VLA argument and in C++.)

Attached is a revised patch with this approach.

Martin

PS POSIX says regexec() ignores pnmatch when REG_NOSUB is set, so
strictly speaking, warning for such calls to it in that case is
also a false positive.  I think that's an acceptable trade-off for
the buffer overflow detection (passing a zero nmatch in that case
is a trivial way to avoid the warning).

[-- Attachment #2: glibc-28170.diff --]
[-- Type: text/x-patch, Size: 1753 bytes --]

diff --git a/include/regex.h b/include/regex.h
index 24eca2c297..d67ec65ad9 100644
--- a/include/regex.h
+++ b/include/regex.h
@@ -36,8 +36,15 @@ extern void __re_set_registers
 extern int __regcomp (regex_t *__preg, const char *__pattern, int __cflags);
 libc_hidden_proto (__regcomp)
 
+#if 199901L <= __STDC_VERSION__ && !defined (__STDC_NO_VLA__)
 extern int __regexec (const regex_t *__preg, const char *__string,
-		      size_t __nmatch, regmatch_t __pmatch[], int __eflags);
+		      size_t __nmatch, regmatch_t __pmatch[__nmatch],
+		      int __eflags);
+#else
+extern int __regexec (const regex_t *__preg, const char *__string,
+		      size_t __nmatch, regmatch_t __pmatch[],
+		      int __eflags);
+#endif
 libc_hidden_proto (__regexec)
 
 extern size_t __regerror (int __errcode, const regex_t *__preg,
diff --git a/posix/regex.h b/posix/regex.h
index 14fb1d8364..ebacbaf95d 100644
--- a/posix/regex.h
+++ b/posix/regex.h
@@ -652,11 +652,17 @@ extern int regcomp (regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __pattern,
 		    int __cflags);
 
+#if 199901L <= __STDC_VERSION__ && !defined (__STDC_NO_VLA__)
 extern int regexec (const regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __String, size_t __nmatch,
-		    regmatch_t __pmatch[_Restrict_arr_],
-		    int __eflags)
-    __attr_access ((__write_only__, 4, 3));
+		    regmatch_t __pmatch[_Restrict_arr_ __nmatch],
+		    int __eflags);
+#else
+extern int regexec (const regex_t *_Restrict_ __preg,
+		    const char *_Restrict_ __String, size_t __nmatch,
+		    regmatch_t __pmatch[_Restrict_arr],
+		    int __eflags);
+#endif
 
 extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
 			char *_Restrict_ __errbuf, size_t __errbuf_size)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] remove attribute access from regexec
  2021-08-13 21:30   ` Martin Sebor
@ 2021-08-13 22:34     ` Paul Eggert
  2021-08-14 20:08       ` Martin Sebor
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-08-13 22:34 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GNU C Library

On 8/13/21 2:30 PM, Martin Sebor wrote:
> Attached is a revised patch with this approach.

The revised patch is to include/regex.h but the original patch was to 
posix/regex.h. Is that intentional?

We need to check whether __STDC_VERSION__ is defined. Also, no need for 
parens around arg of 'defined'. Something like this perhaps:

   #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
        && !defined __STDC_NO_VLA__)

Also, the duplication of the declarations make the headers harder to 
read and encourage typos (I noticed one typo: "_Restrict_arr" without 
the trailing "_"). Instead, I suggest something like this:

   #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
        && !defined __STDC_NO_VLA__)
   # define _REGEX_VLA(arg) arg
   #else
   # define _REGEX_VLA(arg)
   #endif

That way, we can simply change "regmatch_t __pmatch[_Restrict_arr_]" to 
"regmatch_t __pmatch[_Restrict_arr_ _REGEX_VLA (__nmatch)]" without 
having to duplicate the entire function declaration.

> PS POSIX says regexec() ignores pnmatch when REG_NOSUB is set, so
> strictly speaking, warning for such calls to it in that case is
> also a false positive.

Ouch, this casts doubt on the entire exercise. It's not simply about 
warnings: it's about the code being generated for the matcher. For 
example, for:

int
f (_Bool flag, unsigned long n, int a[n])
{
   return n == 0 ? 0 : flag ? a[n - 1] : a[0];
}

a compiler is allowed to generate code that loads a[n - 1] even when 
FLAG is false. Similarly, if we add this VLA business to regexec, the 
generated machine code could dereference pmatch unconditionally even if 
our source code makes the dereferencing conditional on REG_NOSUB, and 
the resulting behavior would fail to conform to POSIX.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] remove attribute access from regexec
  2021-08-13 22:34     ` Paul Eggert
@ 2021-08-14 20:08       ` Martin Sebor
  2021-08-18 15:53         ` [PATCH v2] " Martin Sebor
  2021-08-18 19:52         ` [PATCH] " Paul Eggert
  0 siblings, 2 replies; 22+ messages in thread
From: Martin Sebor @ 2021-08-14 20:08 UTC (permalink / raw)
  To: Paul Eggert; +Cc: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 2497 bytes --]

On 8/13/21 4:34 PM, Paul Eggert wrote:
> On 8/13/21 2:30 PM, Martin Sebor wrote:
>> Attached is a revised patch with this approach.
> 
> The revised patch is to include/regex.h but the original patch was to 
> posix/regex.h. Is that intentional?

Yes, they need to be consistent, otherwise GCC issues -Wvla-parameter.
(That's to help detect inadvertent array/VLA mismatches as well as
mismatches in the VLA parameter bounds.)

> 
> We need to check whether __STDC_VERSION__ is defined. Also, no need for 
> parens around arg of 'defined'. Something like this perhaps:
> 
>    #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
>         && !defined __STDC_NO_VLA__)
> 
> Also, the duplication of the declarations make the headers harder to 
> read and encourage typos (I noticed one typo: "_Restrict_arr" without 
> the trailing "_"). Instead, I suggest something like this:
> 
>    #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
>         && !defined __STDC_NO_VLA__)
>    # define _REGEX_VLA(arg) arg
>    #else
>    # define _REGEX_VLA(arg)
>    #endif
> 
> That way, we can simply change "regmatch_t __pmatch[_Restrict_arr_]" to 
> "regmatch_t __pmatch[_Restrict_arr_ _REGEX_VLA (__nmatch)]" without 
> having to duplicate the entire function declaration.

Sounds good.  I've defined the macro in cdefs.h and mamed it _VLA_ARG
to make it usable in other contexts.  Please see the attached revision.

> 
>> PS POSIX says regexec() ignores pnmatch when REG_NOSUB is set, so
>> strictly speaking, warning for such calls to it in that case is
>> also a false positive.
> 
> Ouch, this casts doubt on the entire exercise. It's not simply about 
> warnings: it's about the code being generated for the matcher. For 
> example, for:
> 
> int
> f (_Bool flag, unsigned long n, int a[n])
> {
>    return n == 0 ? 0 : flag ? a[n - 1] : a[0];
> }
> 
> a compiler is allowed to generate code that loads a[n - 1] even when 
> FLAG is false. Similarly, if we add this VLA business to regexec, the 
> generated machine code could dereference pmatch unconditionally even if 
> our source code makes the dereferencing conditional on REG_NOSUB, and 
> the resulting behavior would fail to conform to POSIX.

The VLA bound by itself doesn't affect codegen.  I suspect you're
thinking of a[static n]?  With just a[n], without static, there
is no requirement that a point to an array with n elements.  It
simply declares an ordinary pointer, same as [] or *.

Martin

[-- Attachment #2: glibc-28170.diff --]
[-- Type: text/x-patch, Size: 1835 bytes --]

diff --git a/include/regex.h b/include/regex.h
index 24eca2c297..5cf3ef7636 100644
--- a/include/regex.h
+++ b/include/regex.h
@@ -37,7 +37,8 @@ extern int __regcomp (regex_t *__preg, const char *__pattern, int __cflags);
 libc_hidden_proto (__regcomp)
 
 extern int __regexec (const regex_t *__preg, const char *__string,
-		      size_t __nmatch, regmatch_t __pmatch[], int __eflags);
+		      size_t __nmatch, regmatch_t __pmatch[_VLA_ARG (__nmatch)],
+		      int __eflags);
 libc_hidden_proto (__regexec)
 
 extern size_t __regerror (int __errcode, const regex_t *__preg,
diff --git a/misc/sys/cdefs.h b/misc/sys/cdefs.h
index e490fc1aeb..6b6e4c233c 100644
--- a/misc/sys/cdefs.h
+++ b/misc/sys/cdefs.h
@@ -632,4 +632,14 @@ _Static_assert (0, "IEEE 128-bits long double requires redirection on this platf
 # define __attribute_returns_twice__ /* Ignore.  */
 #endif
 
+
+#if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__	\
+     && !defined __STDC_NO_VLA__)
+/* Used to specify a variable bound in a declaration of a function
+   VLA-like parameter, as in 'int f (int n, int[_VLA_ARG (n)n]);'  */
+# define _VLA_ARG(arg) arg
+#else
+# define _VLA_ARG(arg)
+#endif
+
 #endif	 /* sys/cdefs.h */
diff --git a/posix/regex.h b/posix/regex.h
index 14fb1d8364..52ccc4b577 100644
--- a/posix/regex.h
+++ b/posix/regex.h
@@ -654,9 +654,8 @@ extern int regcomp (regex_t *_Restrict_ __preg,
 
 extern int regexec (const regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __String, size_t __nmatch,
-		    regmatch_t __pmatch[_Restrict_arr_],
-		    int __eflags)
-    __attr_access ((__write_only__, 4, 3));
+		    regmatch_t __pmatch[_Restrict_arr_ _VLA_ARG (__nmatch)],
+		    int __eflags);
 
 extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
 			char *_Restrict_ __errbuf, size_t __errbuf_size)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2] remove attribute access from regexec
  2021-08-14 20:08       ` Martin Sebor
@ 2021-08-18 15:53         ` Martin Sebor
  2021-08-18 19:52         ` [PATCH] " Paul Eggert
  1 sibling, 0 replies; 22+ messages in thread
From: Martin Sebor @ 2021-08-18 15:53 UTC (permalink / raw)
  To: Paul Eggert; +Cc: GNU C Library, Carlos O'Donell

[-- Attachment #1: Type: text/plain, Size: 3113 bytes --]

Let me repost an updated patch as v2 to let the patch tester know
about the revision, and to also include the corresponding change
to regexec.c needed to avoid the -Wvla-parameter I mentioned.

Apparently the first patch failed to apply and the second one
wasn't picked up because the subject line hasn't changed.  Thanks
for letting me know, Carlos!

Here's a link to the Patchwork check:
http://patchwork.sourceware.org/project/glibc/patch/15a32181-8060-4135-cb72-9e79831697d5@gmail.com/

On 8/14/21 2:08 PM, Martin Sebor wrote:
> On 8/13/21 4:34 PM, Paul Eggert wrote:
>> On 8/13/21 2:30 PM, Martin Sebor wrote:
>>> Attached is a revised patch with this approach.
>>
>> The revised patch is to include/regex.h but the original patch was to 
>> posix/regex.h. Is that intentional?
> 
> Yes, they need to be consistent, otherwise GCC issues -Wvla-parameter.
> (That's to help detect inadvertent array/VLA mismatches as well as
> mismatches in the VLA parameter bounds.)
> 
>>
>> We need to check whether __STDC_VERSION__ is defined. Also, no need 
>> for parens around arg of 'defined'. Something like this perhaps:
>>
>>    #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
>>         && !defined __STDC_NO_VLA__)
>>
>> Also, the duplication of the declarations make the headers harder to 
>> read and encourage typos (I noticed one typo: "_Restrict_arr" without 
>> the trailing "_"). Instead, I suggest something like this:
>>
>>    #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
>>         && !defined __STDC_NO_VLA__)
>>    # define _REGEX_VLA(arg) arg
>>    #else
>>    # define _REGEX_VLA(arg)
>>    #endif
>>
>> That way, we can simply change "regmatch_t __pmatch[_Restrict_arr_]" 
>> to "regmatch_t __pmatch[_Restrict_arr_ _REGEX_VLA (__nmatch)]" without 
>> having to duplicate the entire function declaration.
> 
> Sounds good.  I've defined the macro in cdefs.h and mamed it _VLA_ARG
> to make it usable in other contexts.  Please see the attached revision.
> 
>>
>>> PS POSIX says regexec() ignores pnmatch when REG_NOSUB is set, so
>>> strictly speaking, warning for such calls to it in that case is
>>> also a false positive.
>>
>> Ouch, this casts doubt on the entire exercise. It's not simply about 
>> warnings: it's about the code being generated for the matcher. For 
>> example, for:
>>
>> int
>> f (_Bool flag, unsigned long n, int a[n])
>> {
>>    return n == 0 ? 0 : flag ? a[n - 1] : a[0];
>> }
>>
>> a compiler is allowed to generate code that loads a[n - 1] even when 
>> FLAG is false. Similarly, if we add this VLA business to regexec, the 
>> generated machine code could dereference pmatch unconditionally even 
>> if our source code makes the dereferencing conditional on REG_NOSUB, 
>> and the resulting behavior would fail to conform to POSIX.
> 
> The VLA bound by itself doesn't affect codegen.  I suspect you're
> thinking of a[static n]?  With just a[n], without static, there
> is no requirement that a point to an array with n elements.  It
> simply declares an ordinary pointer, same as [] or *.
> 
> Martin


[-- Attachment #2: glibc-28170.diff --]
[-- Type: text/x-patch, Size: 2305 bytes --]

diff --git a/include/regex.h b/include/regex.h
index 24eca2c297..5cf3ef7636 100644
--- a/include/regex.h
+++ b/include/regex.h
@@ -37,7 +37,8 @@ extern int __regcomp (regex_t *__preg, const char *__pattern, int __cflags);
 libc_hidden_proto (__regcomp)
 
 extern int __regexec (const regex_t *__preg, const char *__string,
-		      size_t __nmatch, regmatch_t __pmatch[], int __eflags);
+		      size_t __nmatch, regmatch_t __pmatch[_VLA_ARG (__nmatch)],
+		      int __eflags);
 libc_hidden_proto (__regexec)
 
 extern size_t __regerror (int __errcode, const regex_t *__preg,
diff --git a/misc/sys/cdefs.h b/misc/sys/cdefs.h
index e490fc1aeb..6b6e4c233c 100644
--- a/misc/sys/cdefs.h
+++ b/misc/sys/cdefs.h
@@ -632,4 +632,14 @@ _Static_assert (0, "IEEE 128-bits long double requires redirection on this platf
 # define __attribute_returns_twice__ /* Ignore.  */
 #endif
 
+
+#if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__	\
+     && !defined __STDC_NO_VLA__)
+/* Used to specify a variable bound in a declaration of a function
+   VLA-like parameter, as in 'int f (int n, int[_VLA_ARG (n)n]);'  */
+# define _VLA_ARG(arg) arg
+#else
+# define _VLA_ARG(arg)
+#endif
+
 #endif	 /* sys/cdefs.h */
diff --git a/posix/regex.h b/posix/regex.h
index 14fb1d8364..52ccc4b577 100644
--- a/posix/regex.h
+++ b/posix/regex.h
@@ -654,9 +654,8 @@ extern int regcomp (regex_t *_Restrict_ __preg,
 
 extern int regexec (const regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __String, size_t __nmatch,
-		    regmatch_t __pmatch[_Restrict_arr_],
-		    int __eflags)
-    __attr_access ((__write_only__, 4, 3));
+		    regmatch_t __pmatch[_Restrict_arr_ _VLA_ARG (__nmatch)],
+		    int __eflags);
 
 extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
 			char *_Restrict_ __errbuf, size_t __errbuf_size)
diff --git a/posix/regexec.c b/posix/regexec.c
index f7b4f9cfc3..7b9d0ee65f 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -190,7 +190,7 @@ static reg_errcode_t extend_buffers (re_match_context_t *mctx, int min_len);
 
 int
 regexec (const regex_t *__restrict preg, const char *__restrict string,
-	 size_t nmatch, regmatch_t pmatch[], int eflags)
+	 size_t nmatch, regmatch_t pmatch[_VLA_ARG (nmatch)], int eflags)
 {
   reg_errcode_t err;
   Idx start, length;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] remove attribute access from regexec
  2021-08-14 20:08       ` Martin Sebor
  2021-08-18 15:53         ` [PATCH v2] " Martin Sebor
@ 2021-08-18 19:52         ` Paul Eggert
  2021-08-19 23:50           ` [PATCH v3] " Martin Sebor
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-08-18 19:52 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GNU C Library

On 8/14/21 1:08 PM, Martin Sebor wrote:
> The VLA bound by itself doesn't affect codegen.  I suspect you're
> thinking of a[static n]?  With just a[n], without static, there
> is no requirement that a point to an array with n elements.  It
> simply declares an ordinary pointer, same as [] or *.

Thanks for clarifying.

I tried using a patch like that on coreutils, but it caused the build to 
fail like this:

   In file included from lib/exclude.c:35:
   ./lib/regex.h:661:7: error: ISO C90 forbids variable length array 
'__pmatch' [-Werror=vla]
     661 |       regmatch_t __pmatch[_Restrict_arr_ _VLA_ARG (__nmatch)],
         |       ^~~~~~~~~~
   cc1: all warnings being treated as errors
   make[2]: *** [Makefile:10648: lib/exclude.o] Error 1

This is because coreutils is compiled with -Wvla -Werror, to catch 
inadvertent attempts to use VLAs in local variables (this helps avoid 
stack-overflow problems). It'd be unfortunate if we had to give that 
useful diagnostic up simply due to this declaration, as there's no 
stack-overflow problem here.

If you can think of a way around this issue, here are some other things 
I ran into while trying this idea out on Coreutils.

* Other cdefs.h macros (__NTH, __REDIRECT, etc.) start with two 
underscores, so shouldn't this new macro too?

* Come to think of it, the name _VLA_ARG could be improved, as its 
argument is not actually a VLA; it's the number of elements in a 
VLA-like array. Also, its formal-parameter "arg" is confusingly-named, 
as it's an arbitrary integer expression and need not be a function 
parameter name. How about naming the macro __ARG_NELTS instead?

* regex.h cannot use __ARG_NELTS directly, for the same reason it can't 
use __restrict_arr directly: regex.h is shared with Gnulib and can't 
assume that a glibc-like sys/cdefs.h is present. I suppose regex.h would 
need something like this:

   #ifndef _ARG_NELTS_
   # ifdef __ARG_NELTS
   #  define _ARG_NELTS_(arg) __ARG_NELTS (arg)
   # elif (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
	  && !defined __STDC_NO_VLA__)
   #  define _ARG_NELTS_(n) n
   # else
   #  define _ARG_NELTS_(n)
   # endif
   #endif

and then use _ARG_NELTS_ later.

* The cdefs.h comment has a stray 'n', its wording could be improved (I 
misread "variable bound" as a variable that's bound to something), 
there's a stray empty line, and it's nicer to put the comment in front 
of all the lines that define the macro. Perhaps something like this:

   /* Specify the number of elements of a function's array parameter,
      as in 'int f (int n, int a[__ARG_NELTS (n)]);'.  */
   #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
        && !defined __STDC_NO_VLA__)
   # define __ARG_NELTS(n) n
   #else
   # define __ARG_NELTS(n)
   #endif

Though again, it's not clear to me that this idea will fly at all, due 
to the -Wvla issue.

Maybe GCC's -Wvla should be fixed to not report an error in this case? 
It's actually not a VLA if you ask me (the C standard is unclear).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3] remove attribute access from regexec
  2021-08-18 19:52         ` [PATCH] " Paul Eggert
@ 2021-08-19 23:50           ` Martin Sebor
  2021-08-22  5:06             ` Paul Eggert
  2021-10-19 16:39             ` Carlos O'Donell
  0 siblings, 2 replies; 22+ messages in thread
From: Martin Sebor @ 2021-08-19 23:50 UTC (permalink / raw)
  To: Paul Eggert; +Cc: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 4283 bytes --]

On 8/18/21 1:52 PM, Paul Eggert wrote:
> On 8/14/21 1:08 PM, Martin Sebor wrote:
>> The VLA bound by itself doesn't affect codegen.  I suspect you're
>> thinking of a[static n]?  With just a[n], without static, there
>> is no requirement that a point to an array with n elements.  It
>> simply declares an ordinary pointer, same as [] or *.
> 
> Thanks for clarifying.
> 
> I tried using a patch like that on coreutils, but it caused the build to 
> fail like this:
> 
>    In file included from lib/exclude.c:35:
>    ./lib/regex.h:661:7: error: ISO C90 forbids variable length array 
> '__pmatch' [-Werror=vla]
>      661 |       regmatch_t __pmatch[_Restrict_arr_ _VLA_ARG (__nmatch)],
>          |       ^~~~~~~~~~
>    cc1: all warnings being treated as errors
>    make[2]: *** [Makefile:10648: lib/exclude.o] Error 1
> 
> This is because coreutils is compiled with -Wvla -Werror, to catch 
> inadvertent attempts to use VLAs in local variables (this helps avoid 
> stack-overflow problems). It'd be unfortunate if we had to give that 
> useful diagnostic up simply due to this declaration, as there's no 
> stack-overflow problem here.
> 
> If you can think of a way around this issue, here are some other things 
> I ran into while trying this idea out on Coreutils.

Thanks the for the additional testing!  I wouldn't expect to see
-Wvla for a Glibc declaration outside of a Glibc build.   As
a lexical warning, -Wvla shouldn't (and in my tests doesn't) trigger
for code in system headers unless it's enabled by -Wsystem-headers.
Is <regex.h> for some reason not considered a system header in your
test environment?

> 
> * Other cdefs.h macros (__NTH, __REDIRECT, etc.) start with two 
> underscores, so shouldn't this new macro too?

They're both reserved but I'm happy to go with whatever convention
is preferred in Glibc.

> 
> * Come to think of it, the name _VLA_ARG could be improved, as its 
> argument is not actually a VLA; it's the number of elements in a 
> VLA-like array. Also, its formal-parameter "arg" is confusingly-named, 
> as it's an arbitrary integer expression and need not be a function 
> parameter name. How about naming the macro __ARG_NELTS instead?

That works for me.

> 
> * regex.h cannot use __ARG_NELTS directly, for the same reason it can't 
> use __restrict_arr directly: regex.h is shared with Gnulib and can't 
> assume that a glibc-like sys/cdefs.h is present. I suppose regex.h would 
> need something like this:
> 
>    #ifndef _ARG_NELTS_
>    # ifdef __ARG_NELTS
>    #  define _ARG_NELTS_(arg) __ARG_NELTS (arg)
>    # elif (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
>        && !defined __STDC_NO_VLA__)
>    #  define _ARG_NELTS_(n) n
>    # else
>    #  define _ARG_NELTS_(n)
>    # endif
>    #endif
> 
> and then use _ARG_NELTS_ later.

I didn't know mixing and matching two implementations like this
was even possible.  Thanks for explaining it (though it seems
like a pretty cumbersome arrangement).  I've made the suggested
change.

> 
> * The cdefs.h comment has a stray 'n', its wording could be improved (I 
> misread "variable bound" as a variable that's bound to something), 
> there's a stray empty line, and it's nicer to put the comment in front 
> of all the lines that define the macro. Perhaps something like this:
> 
>    /* Specify the number of elements of a function's array parameter,
>       as in 'int f (int n, int a[__ARG_NELTS (n)]);'.  */
>    #if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
>         && !defined __STDC_NO_VLA__)
>    # define __ARG_NELTS(n) n
>    #else
>    # define __ARG_NELTS(n)
>    #endif

I've changed the macro to the above.

> 
> Though again, it's not clear to me that this idea will fly at all, due 
> to the -Wvla issue.
> 
> Maybe GCC's -Wvla should be fixed to not report an error in this case? 
> It's actually not a VLA if you ask me (the C standard is unclear).

I agree.  Someone else made the same suggestion in GCC bug 98217 (and
I even offered to handle it).  I'll try to remember to get to it but
as I said above, I don't think it should be necessary for this change.

Attached is yet another revision of this patch (v3 to let the patch
tester pick it up).

Martin

[-- Attachment #2: glibc-28170.diff --]
[-- Type: text/x-patch, Size: 2828 bytes --]

diff --git a/include/regex.h b/include/regex.h
index 24eca2c297..76fa798861 100644
--- a/include/regex.h
+++ b/include/regex.h
@@ -36,8 +36,24 @@ extern void __re_set_registers
 extern int __regcomp (regex_t *__preg, const char *__pattern, int __cflags);
 libc_hidden_proto (__regcomp)
 
+
+#ifndef __ARG_NELTS
+# ifdef __ARG_NELTS
+/* Same as the corresponding cdefs.h macro.  Defined for builds with
+   no cdefs.h.  */
+#  define __ARG_NELTS(arg) __ARG_NELTS (arg)
+# elif (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
+        && !defined __STDC_NO_VLA__)
+#  define __ARG_NELTS(n) n
+# else
+#  define __ARG_NELTS(n)
+# endif
+#endif
+
 extern int __regexec (const regex_t *__preg, const char *__string,
-		      size_t __nmatch, regmatch_t __pmatch[], int __eflags);
+		      size_t __nmatch,
+		      regmatch_t __pmatch[__ARG_NELTS (__nmatch)],
+		      int __eflags);
 libc_hidden_proto (__regexec)
 
 extern size_t __regerror (int __errcode, const regex_t *__preg,
diff --git a/misc/sys/cdefs.h b/misc/sys/cdefs.h
index e490fc1aeb..64e46df190 100644
--- a/misc/sys/cdefs.h
+++ b/misc/sys/cdefs.h
@@ -632,4 +632,17 @@ _Static_assert (0, "IEEE 128-bits long double requires redirection on this platf
 # define __attribute_returns_twice__ /* Ignore.  */
 #endif
 
+#ifndef __ARG_NELTS
+# ifdef __ARG_NELTS
+/* Used to specify a variable bound in a declaration of a function
+   VLA-like parameter, as in 'int f (int n, int[__ARG_NELTS (n)]);'  */
+#  define __ARG_NELTS(arg) __ARG_NELTS (arg)
+# elif (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
+        && !defined __STDC_NO_VLA__)
+#  define __ARG_NELTS(n) n
+# else
+#  define __ARG_NELTS(n)
+# endif
+#endif
+
 #endif	 /* sys/cdefs.h */
diff --git a/posix/regex.h b/posix/regex.h
index 14fb1d8364..5b44f8e52b 100644
--- a/posix/regex.h
+++ b/posix/regex.h
@@ -654,9 +654,8 @@ extern int regcomp (regex_t *_Restrict_ __preg,
 
 extern int regexec (const regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __String, size_t __nmatch,
-		    regmatch_t __pmatch[_Restrict_arr_],
-		    int __eflags)
-    __attr_access ((__write_only__, 4, 3));
+		    regmatch_t __pmatch[_Restrict_arr_ __ARG_NELTS (__nmatch)],
+		    int __eflags);
 
 extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
 			char *_Restrict_ __errbuf, size_t __errbuf_size)
diff --git a/posix/regexec.c b/posix/regexec.c
index f7b4f9cfc3..bec2fdfe39 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -190,7 +190,7 @@ static reg_errcode_t extend_buffers (re_match_context_t *mctx, int min_len);
 
 int
 regexec (const regex_t *__restrict preg, const char *__restrict string,
-	 size_t nmatch, regmatch_t pmatch[], int eflags)
+	 size_t nmatch, regmatch_t pmatch[__ARG_NELTS (nmatch)], int eflags)
 {
   reg_errcode_t err;
   Idx start, length;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-19 23:50           ` [PATCH v3] " Martin Sebor
@ 2021-08-22  5:06             ` Paul Eggert
  2021-08-26 15:06               ` Martin Sebor
  2021-10-19 16:39             ` Carlos O'Donell
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-08-22  5:06 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 2096 bytes --]

On 8/19/21 4:50 PM, Martin Sebor wrote:

> Is <regex.h> for some reason not considered a system header in your
> test environment?

Yes, because Coreutils etc. currently use Gnulib regex.h instead of 
/usr/include/regex.h, as Glibc regex has a bug and the fix (in Gnulib) 
hasn't made its way back to Glibc yet.

This sort of thing is all too common, as regex development (such as it 
is) often occurs in Gnulib first, which means it's common for 
'configure' to discover that the current Glibc has a bug and to 
substitute Gnulib regex instead. And when we compile Gnulib headers we 
don't consider them to be system headers, because we want the extra 
static checking that gcc does for non system headers.

I think the simplest workaround for the warning is to disable -Wvla for 
that particular declaration. Please see attached Gnulib patch for how I 
suggest doing this.

> I didn't know mixing and matching two implementations like this
> was even possible.  Thanks for explaining it (though it seems
> like a pretty cumbersome arrangement).

Yes, it is a bit awkward. I am tempted to clean it up but that really 
should be a separate patch and so I will focus only on this thread's issue.

> +#ifndef __ARG_NELTS
> +# ifdef __ARG_NELTS

A clear typo. The suggestion was to define and use _ARG_NELTS_ instead.

Some other thoughts.

If we're going to change regexec, we should change __compat_regex to be 
consistent.

We should also change regexec.c's internal functions to be consistent 
with regexec. This includes re_search_internal, update_regs, etc.

These internal functions don't have regexec's quirk that pmatch is 
ignored when not substituting, so their parameters can use the syntax 
"regmatch_t pmatch[__ARG_NELTS (static nmatch)]" to let the compiler 
know that the array must be of the given size.

Proposed Gnulib patch attached. It should be easy to aplly to glibc 
though glibc also needs its own regex.h patched. I have not installed 
this into Gnulib yet.

At some point we should sync Gnulib and Glibc regex of course.

[-- Attachment #2: 0001-regex-use-C99-style-array-arg-syntax.patch --]
[-- Type: text/x-patch, Size: 12543 bytes --]

From 284581e11190ba13ff5a6f1dfe14d292866f1dc5 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 21 Aug 2021 21:58:16 -0700
Subject: [PATCH] regex: use C99-style array arg syntax

This should help with some static checking.
Derived from a suggestion by Martin Sebor in:
https://sourceware.org/pipermail/libc-alpha/2021-August/130336.html
* lib/cdefs.h (__ARG_NELTS): New macro.
* lib/regex.c: Ignore -Wvla for the whole file.
* lib/regex.h (_ARG_NELTS_): New macro.
Ignore -Wvla when declaring regexec.
* lib/regexec.c (regexec, __compat_regexec, re_copy_regs)
(re_search_internal, proceed_next_node, push_fail_stack)
(pop_fail_stack, set_regs, update_regs):
Use __ARG_NELTS for each array parameter whose size is another arg.
---
 ChangeLog     | 15 ++++++++++
 lib/cdefs.h   |  9 ++++++
 lib/regex.c   |  1 +
 lib/regex.h   | 21 +++++++++++++-
 lib/regexec.c | 78 +++++++++++++++++++++++++++++----------------------
 5 files changed, 89 insertions(+), 35 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 732b6f1fff..346e17d305 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2021-08-21  Paul Eggert  <eggert@cs.ucla.edu>
+
+	regex: use C99-style array arg syntax
+	This should help with some static checking.
+	Derived from a suggestion by Martin Sebor in:
+	https://sourceware.org/pipermail/libc-alpha/2021-August/130336.html
+	* lib/cdefs.h (__ARG_NELTS): New macro.
+	* lib/regex.c: Ignore -Wvla for the whole file.
+	* lib/regex.h (_ARG_NELTS_): New macro.
+	Ignore -Wvla when declaring regexec.
+	* lib/regexec.c (regexec, __compat_regexec, re_copy_regs)
+	(re_search_internal, proceed_next_node, push_fail_stack)
+	(pop_fail_stack, set_regs, update_regs):
+	Use __ARG_NELTS for each array parameter whose size is another arg.
+
 2021-08-21  Bruno Haible  <bruno@clisp.org>
 
 	c-stack: Test for libsigsegv once, not twice.
diff --git a/lib/cdefs.h b/lib/cdefs.h
index 4dac9d264d..13c5542bfd 100644
--- a/lib/cdefs.h
+++ b/lib/cdefs.h
@@ -634,4 +634,13 @@ _Static_assert (0, "IEEE 128-bits long double requires redirection on this platf
 # define __attribute_returns_twice__ /* Ignore.  */
 #endif
 
+/* Specify the number of elements of a function's array parameter,
+   as in 'int f (int n, int a[__ARG_NELTS (n)]);'.  */
+#if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
+     && !defined __STDC_NO_VLA__)
+# define __ARG_NELTS(n) n
+#else
+# define __ARG_NELTS(n)
+#endif
+
 #endif	 /* sys/cdefs.h */
diff --git a/lib/regex.c b/lib/regex.c
index 7296be0f08..d32863972c 100644
--- a/lib/regex.c
+++ b/lib/regex.c
@@ -24,6 +24,7 @@
 
 # if __GNUC_PREREQ (4, 6)
 #  pragma GCC diagnostic ignored "-Wsuggest-attribute=pure"
+#  pragma GCC diagnostic ignored "-Wvla"
 # endif
 # if __GNUC_PREREQ (4, 3)
 #  pragma GCC diagnostic ignored "-Wold-style-definition"
diff --git a/lib/regex.h b/lib/regex.h
index 8e4ef45578..561af78298 100644
--- a/lib/regex.h
+++ b/lib/regex.h
@@ -640,6 +640,22 @@ extern int re_exec (const char *);
 # endif
 #endif
 
+#ifndef _ARG_NELTS_
+# ifdef __ARG_NELTS
+#  define _ARG_NELTS_(arg) __ARG_NELTS (arg)
+# elif (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
+	&& !defined __STDC_NO_VLA__)
+#  define _ARG_NELTS_(n) n
+# else
+#  define _ARG_NELTS_(n)
+# endif
+#endif
+
+#if defined __GNUC__ && 4 < __GNUC__ + (6 <= __GNUC_MINOR__)
+# pragma GCC diagnostic push
+# pragma GCC diagnostic ignored "-Wvla"
+#endif
+
 /* POSIX compatibility.  */
 extern int regcomp (regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __pattern,
@@ -647,7 +663,7 @@ extern int regcomp (regex_t *_Restrict_ __preg,
 
 extern int regexec (const regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __String, size_t __nmatch,
-		    regmatch_t __pmatch[_Restrict_arr_],
+		    regmatch_t __pmatch[_Restrict_arr_ _ARG_NELTS_ (__nmatch)],
 		    int __eflags);
 
 extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
@@ -655,6 +671,9 @@ extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
 
 extern void regfree (regex_t *__preg);
 
+#if defined __GNUC__ && 4 < __GNUC__ + (6 <= __GNUC_MINOR__)
+# pragma GCC diagnostic pop
+#endif
 
 #ifdef __cplusplus
 }
diff --git a/lib/regexec.c b/lib/regexec.c
index 5e4eb497a6..da1fb7fafa 100644
--- a/lib/regexec.c
+++ b/lib/regexec.c
@@ -31,11 +31,11 @@ static re_sub_match_last_t * match_ctx_add_sublast (re_sub_match_top_t *subtop,
 static void sift_ctx_init (re_sift_context_t *sctx, re_dfastate_t **sifted_sts,
 			   re_dfastate_t **limited_sts, Idx last_node,
 			   Idx last_str_idx);
-static reg_errcode_t re_search_internal (const regex_t *preg,
-					 const char *string, Idx length,
-					 Idx start, Idx last_start, Idx stop,
-					 size_t nmatch, regmatch_t pmatch[],
-					 int eflags);
+static reg_errcode_t
+re_search_internal (const regex_t *preg, const char *string, Idx length,
+		    Idx start, Idx last_start, Idx stop, size_t nmatch,
+		    regmatch_t pmatch[__ARG_NELTS (static nmatch)],
+		    int eflags);
 static regoff_t re_search_2_stub (struct re_pattern_buffer *bufp,
 				  const char *string1, Idx length1,
 				  const char *string2, Idx length2,
@@ -47,23 +47,29 @@ static regoff_t re_search_stub (struct re_pattern_buffer *bufp,
 				regoff_t range, Idx stop,
 				struct re_registers *regs,
 				bool ret_len);
-static unsigned re_copy_regs (struct re_registers *regs, regmatch_t *pmatch,
-                              Idx nregs, int regs_allocated);
+static unsigned re_copy_regs (struct re_registers *regs, Idx nregs,
+			      regmatch_t pmatch[__ARG_NELTS (static nregs)],
+			      int regs_allocated);
 static reg_errcode_t prune_impossible_nodes (re_match_context_t *mctx);
 static Idx check_matching (re_match_context_t *mctx, bool fl_longest_match,
 			   Idx *p_match_first);
 static Idx check_halt_state_context (const re_match_context_t *mctx,
 				     const re_dfastate_t *state, Idx idx);
-static void update_regs (const re_dfa_t *dfa, regmatch_t *pmatch,
-			 regmatch_t *prev_idx_match, Idx cur_node,
-			 Idx cur_idx, Idx nmatch);
-static reg_errcode_t push_fail_stack (struct re_fail_stack_t *fs,
-				      Idx str_idx, Idx dest_node, Idx nregs,
-				      regmatch_t *regs, regmatch_t *prevregs,
-				      re_node_set *eps_via_nodes);
+static void
+update_regs (const re_dfa_t *dfa, Idx nmatch,
+	     regmatch_t pmatch[__ARG_NELTS (static nmatch)],
+	     regmatch_t prev_idx_match[__ARG_NELTS (static nmatch)],
+	     Idx cur_node, Idx cur_idx);
+static reg_errcode_t
+push_fail_stack (struct re_fail_stack_t *fs,
+		 Idx str_idx, Idx dest_node, Idx nregs,
+		 regmatch_t regs[__ARG_NELTS (static nregs)],
+		 regmatch_t prevregs[__ARG_NELTS (static nregs)],
+		 re_node_set *eps_via_nodes);
 static reg_errcode_t set_regs (const regex_t *preg,
 			       const re_match_context_t *mctx,
-			       size_t nmatch, regmatch_t *pmatch,
+			       size_t nmatch,
+			       regmatch_t pmatch[__ARG_NELTS (static nmatch)],
 			       bool fl_backtrack);
 static reg_errcode_t free_fail_stack_return (struct re_fail_stack_t *fs);
 
@@ -191,7 +197,7 @@ static reg_errcode_t extend_buffers (re_match_context_t *mctx, int min_len);
 
 int
 regexec (const regex_t *__restrict preg, const char *__restrict string,
-	 size_t nmatch, regmatch_t pmatch[], int eflags)
+	 size_t nmatch, regmatch_t pmatch[__ARG_NELTS (nmatch)], int eflags)
 {
   reg_errcode_t err;
   Idx start, length;
@@ -212,12 +218,8 @@ regexec (const regex_t *__restrict preg, const char *__restrict string,
     }
 
   lock_lock (dfa->lock);
-  if (preg->no_sub)
-    err = re_search_internal (preg, string, length, start, length,
-			      length, 0, NULL, eflags);
-  else
-    err = re_search_internal (preg, string, length, start, length,
-			      length, nmatch, pmatch, eflags);
+  err = re_search_internal (preg, string, length, start, length,
+			    length, preg->no_sub ? 0 : nmatch, pmatch, eflags);
   lock_unlock (dfa->lock);
   return err != REG_NOERROR;
 }
@@ -235,7 +237,7 @@ int
 attribute_compat_text_section
 __compat_regexec (const regex_t *__restrict preg,
 		  const char *__restrict string, size_t nmatch,
-		  regmatch_t pmatch[], int eflags)
+		  regmatch_t pmatch[__ARG_NELTS (nmatch)], int eflags)
 {
   return regexec (preg, string, nmatch, pmatch,
 		  eflags & (REG_NOTBOL | REG_NOTEOL));
@@ -434,7 +436,7 @@ re_search_stub (struct re_pattern_buffer *bufp, const char *string, Idx length,
   else if (regs != NULL)
     {
       /* If caller wants register contents data back, copy them.  */
-      bufp->regs_allocated = re_copy_regs (regs, pmatch, nregs,
+      bufp->regs_allocated = re_copy_regs (regs, nregs, pmatch,
 					   bufp->regs_allocated);
       if (__glibc_unlikely (bufp->regs_allocated == REGS_UNALLOCATED))
 	rval = -2;
@@ -457,7 +459,8 @@ re_search_stub (struct re_pattern_buffer *bufp, const char *string, Idx length,
 }
 
 static unsigned
-re_copy_regs (struct re_registers *regs, regmatch_t *pmatch, Idx nregs,
+re_copy_regs (struct re_registers *regs, Idx nregs,
+	      regmatch_t pmatch[__ARG_NELTS (static nregs)],
 	      int regs_allocated)
 {
   int rval = REGS_REALLOCATE;
@@ -585,7 +588,8 @@ static reg_errcode_t
 __attribute_warn_unused_result__
 re_search_internal (const regex_t *preg, const char *string, Idx length,
 		    Idx start, Idx last_start, Idx stop, size_t nmatch,
-		    regmatch_t pmatch[], int eflags)
+		    regmatch_t pmatch[__ARG_NELTS (static nmatch)],
+		    int eflags)
 {
   reg_errcode_t err;
   const re_dfa_t *dfa = preg->buffer;
@@ -1210,8 +1214,9 @@ check_halt_state_context (const re_match_context_t *mctx,
    return -1 on match failure, -2 on error.  */
 
 static Idx
-proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
-		   regmatch_t *prevregs,
+proceed_next_node (const re_match_context_t *mctx, Idx nregs,
+		   regmatch_t regs[__ARG_NELTS (static nregs)],
+		   regmatch_t prevregs[__ARG_NELTS (static nregs)],
 		   Idx *pidx, Idx node, re_node_set *eps_via_nodes,
 		   struct re_fail_stack_t *fs)
 {
@@ -1321,7 +1326,9 @@ proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
 static reg_errcode_t
 __attribute_warn_unused_result__
 push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, Idx dest_node,
-		 Idx nregs, regmatch_t *regs, regmatch_t *prevregs,
+		 Idx nregs,
+		 regmatch_t regs[__ARG_NELTS (static nregs)],
+		 regmatch_t prevregs[__ARG_NELTS (static nregs)],
 		 re_node_set *eps_via_nodes)
 {
   reg_errcode_t err;
@@ -1349,7 +1356,8 @@ push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, Idx dest_node,
 
 static Idx
 pop_fail_stack (struct re_fail_stack_t *fs, Idx *pidx, Idx nregs,
-		regmatch_t *regs, regmatch_t *prevregs,
+		regmatch_t regs[__ARG_NELTS (static nregs)],
+		regmatch_t prevregs[__ARG_NELTS (static nregs)],
 		re_node_set *eps_via_nodes)
 {
   if (fs == NULL || fs->num == 0)
@@ -1379,7 +1387,7 @@ pop_fail_stack (struct re_fail_stack_t *fs, Idx *pidx, Idx nregs,
 static reg_errcode_t
 __attribute_warn_unused_result__
 set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
-	  regmatch_t *pmatch, bool fl_backtrack)
+	  regmatch_t pmatch[__ARG_NELTS (static nmatch)], bool fl_backtrack)
 {
   const re_dfa_t *dfa = preg->buffer;
   Idx idx, cur_node;
@@ -1415,7 +1423,7 @@ set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
 
   for (idx = pmatch[0].rm_so; idx <= pmatch[0].rm_eo ;)
     {
-      update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch);
+      update_regs (dfa, nmatch, pmatch, prev_idx_match, cur_node, idx);
 
       if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node)
 	  || (fs && re_node_set_contains (&eps_via_nodes, cur_node)))
@@ -1487,8 +1495,10 @@ free_fail_stack_return (struct re_fail_stack_t *fs)
 }
 
 static void
-update_regs (const re_dfa_t *dfa, regmatch_t *pmatch,
-	     regmatch_t *prev_idx_match, Idx cur_node, Idx cur_idx, Idx nmatch)
+update_regs (const re_dfa_t *dfa, Idx nmatch,
+	     regmatch_t pmatch[__ARG_NELTS (static nmatch)],
+	     regmatch_t prev_idx_match[__ARG_NELTS (static nmatch)],
+	     Idx cur_node, Idx cur_idx)
 {
   int type = dfa->nodes[cur_node].type;
   if (type == OP_OPEN_SUBEXP)
-- 
2.31.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-22  5:06             ` Paul Eggert
@ 2021-08-26 15:06               ` Martin Sebor
  2021-08-26 16:24                 ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Sebor @ 2021-08-26 15:06 UTC (permalink / raw)
  To: Paul Eggert; +Cc: GNU C Library

On 8/21/21 11:06 PM, Paul Eggert wrote:
> On 8/19/21 4:50 PM, Martin Sebor wrote:
> 
>> Is <regex.h> for some reason not considered a system header in your
>> test environment?
> 
> Yes, because Coreutils etc. currently use Gnulib regex.h instead of 
> /usr/include/regex.h, as Glibc regex has a bug and the fix (in Gnulib) 
> hasn't made its way back to Glibc yet.
> 
> This sort of thing is all too common, as regex development (such as it 
> is) often occurs in Gnulib first, which means it's common for 
> 'configure' to discover that the current Glibc has a bug and to 
> substitute Gnulib regex instead. And when we compile Gnulib headers we 
> don't consider them to be system headers, because we want the extra 
> static checking that gcc does for non system headers.
> 
> I think the simplest workaround for the warning is to disable -Wvla for 
> that particular declaration. Please see attached Gnulib patch for how I 
> suggest doing this.
> 
>> I didn't know mixing and matching two implementations like this
>> was even possible.  Thanks for explaining it (though it seems
>> like a pretty cumbersome arrangement).
> 
> Yes, it is a bit awkward. I am tempted to clean it up but that really 
> should be a separate patch and so I will focus only on this thread's issue.
> 
>> +#ifndef __ARG_NELTS
>> +# ifdef __ARG_NELTS
> 
> A clear typo. The suggestion was to define and use _ARG_NELTS_ instead.

__ARG_NELTS is  the name of macro in cdefs.h that you suggested so
this should presumably be the same, no?  (I.e., it's not a typo but
I could be missing something.)

> 
> Some other thoughts.
> 
> If we're going to change regexec, we should change __compat_regex to be 
> consistent.
> 
> We should also change regexec.c's internal functions to be consistent 
> with regexec. This includes re_search_internal, update_regs, etc.

That sounds like a good idea for a follow-up.  My immediate goal is
to avoid the current warning so Glibc can build, and I prefer to make
that change independently of other improvements.

I plan to commit the last revision of the patch as is unless there's
some reason I'm missing that the name of the regex.h macro shouldn't
be the same as that in cdefs.h.  I see you have both __ARG_NELTS and
_ARG_NELTS_ in your Gnulib patch though I don't understand why.

Thanks for all your comments!

Martin

> 
> These internal functions don't have regexec's quirk that pmatch is 
> ignored when not substituting, so their parameters can use the syntax 
> "regmatch_t pmatch[__ARG_NELTS (static nmatch)]" to let the compiler 
> know that the array must be of the given size.
> 
> Proposed Gnulib patch attached. It should be easy to aplly to glibc 
> though glibc also needs its own regex.h patched. I have not installed 
> this into Gnulib yet.
> 
> At some point we should sync Gnulib and Glibc regex of course.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-26 15:06               ` Martin Sebor
@ 2021-08-26 16:24                 ` Paul Eggert
  2021-08-26 16:47                   ` Martin Sebor
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-08-26 16:24 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GNU C Library

On 8/26/21 8:06 AM, Martin Sebor wrote:
>>
>>> +#ifndef __ARG_NELTS
>>> +# ifdef __ARG_NELTS
>>
>> A clear typo. The suggestion was to define and use _ARG_NELTS_ instead.
> 
> __ARG_NELTS is  the name of macro in cdefs.h that you suggested so
> this should presumably be the same, no?

No matter what the macro name is (let's call it N), it cannot be right 
to do this:

    #ifndef N
    # ifdef N
    ... whatever ...

because the "whatever" cannot be reached. Unfortunately this is what 
your patch does, with N being __ARG_NELTS.

> I see you have both __ARG_NELTS and
> _ARG_NELTS_ in your Gnulib patch though I don't understand why.

It's for the same reason that regex.h already has both __restrict_arr 
(the normal glibc practice) and  _Restrict_arr_ (an alternative spelling 
used only in regex.h). This is because regex.h is used both in 
/usr/include on GNU systems (where __restrict_arr is used) and in Gnulib 
on non-GNU systems that may not have __restrict_arr.

regex.h does not simply #define __restrict_arr if it's not already 
defined, because any such definition might collide with a later 
GNU-system header that defines __restrict_arr in a different way (this 
could happen if Gnulib regex.h is used on a older GNUish platform).

Obviously this is a messy situation and I'd like to clean it up, but as 
I mentioned in my previous email any such cleanup something better left 
for a later patch. In the meantime we should just continue existing 
practice when we add a new macro like __ARG_NELTS.

> I plan to commit the last revision of the patch as is

Please don't do that. That patch still needs work. Instead, please start 
with the patch I sent, and understand what it does. Unless I'm missing 
something it should be able to go into glibc as-is (though you may need 
to edit glibc files that are not in gnulib).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-26 16:24                 ` Paul Eggert
@ 2021-08-26 16:47                   ` Martin Sebor
  2021-08-27  8:52                     ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Martin Sebor @ 2021-08-26 16:47 UTC (permalink / raw)
  To: Paul Eggert; +Cc: GNU C Library

On 8/26/21 10:24 AM, Paul Eggert wrote:
> On 8/26/21 8:06 AM, Martin Sebor wrote:
>>>
>>>> +#ifndef __ARG_NELTS
>>>> +# ifdef __ARG_NELTS
>>>
>>> A clear typo. The suggestion was to define and use _ARG_NELTS_ instead.
>>
>> __ARG_NELTS is  the name of macro in cdefs.h that you suggested so
>> this should presumably be the same, no?
> 
> No matter what the macro name is (let's call it N), it cannot be right 
> to do this:
> 
>     #ifndef N
>     # ifdef N
>     ... whatever ...
> 
> because the "whatever" cannot be reached. Unfortunately this is what 
> your patch does, with N being __ARG_NELTS.

Aaah.  I took what you had in your email and changed it to
__ARG_NELTS thinking you had a typo there because you had
suggested __ARG_NELTS before, and introduced a typo of my own.

> 
>> I see you have both __ARG_NELTS and
>> _ARG_NELTS_ in your Gnulib patch though I don't understand why.
> 
> It's for the same reason that regex.h already has both __restrict_arr 
> (the normal glibc practice) and  _Restrict_arr_ (an alternative spelling 
> used only in regex.h). This is because regex.h is used both in 
> /usr/include on GNU systems (where __restrict_arr is used) and in Gnulib 
> on non-GNU systems that may not have __restrict_arr.
> 
> regex.h does not simply #define __restrict_arr if it's not already 
> defined, because any such definition might collide with a later 
> GNU-system header that defines __restrict_arr in a different way (this 
> could happen if Gnulib regex.h is used on a older GNUish platform).
> 
> Obviously this is a messy situation and I'd like to clean it up, but as 
> I mentioned in my previous email any such cleanup something better left 
> for a later patch. In the meantime we should just continue existing 
> practice when we add a new macro like __ARG_NELTS.
> 
>> I plan to commit the last revision of the patch as is
> 
> Please don't do that. That patch still needs work. Instead, please start 
> with the patch I sent, and understand what it does. Unless I'm missing 
> something it should be able to go into glibc as-is (though you may need 
> to edit glibc files that are not in gnulib).

I think I'd prefer to let you do this.  It's far more complicated
and extensive than it needs to be to silence just the one warning.

Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-26 16:47                   ` Martin Sebor
@ 2021-08-27  8:52                     ` Paul Eggert
  2021-08-27 16:34                       ` Martin Sebor
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-08-27  8:52 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GNU C Library

[-- Attachment #1: Type: text/plain, Size: 363 bytes --]

On 8/26/21 9:47 AM, Martin Sebor wrote:
> I think I'd prefer to let you do this.

OK, proposed glibc patch attached. As I lack the time to shepherd each 
regex fix into Glibc separately, I simply copied regex-relevant source 
files from Gnulib to Glibc, after merging recent Glibc changes (plus 
your proposal) into Gnulib.

Please give the patch a try.

[-- Attachment #2: 0001-regex-copy-back-from-Gnulib.patch --]
[-- Type: text/x-patch, Size: 24843 bytes --]

From d4f717debf113af1555f1cef5350eed321b7b379 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 27 Aug 2021 01:17:41 -0700
Subject: [PATCH] regex: copy back from Gnulib

Copy regex-related files back from Gnulib, to fix a problem with
static checking of regex calls noted by Martin Sebor.  This merges the
following changes:

* New macro __attribute_nonnull__ in misc/sys/cdefs.h, for use later
when copying other files back from Gnulib.

* Use __GNULIB_CDEFS instead of __GLIBC__ when deciding
whether to include bits/wordsize.h etc.

* Avoid duplicate entries in epsilon closure table.

* New regex.h macro _REGEX_NELTS to let regexec say that its pmatch
arg should contain nmatch elts.  Use that for regexec, instead of
__attr_access (which is incorrect).

* New regex.h macro _Attr_access_ which is like __attr_access except
portable to non-glibc platforms.

* Add some DEBUG_ASSERTs to pacify gcc -fanalyzer and to catch
recently-fixed performance bugs if they recur.

* Add Gnulib-specific stuff to port the dynarray- and lock-using parts
of regex code to non-glibc platforms.

* Fix glibc bug 11053.

* Avoid some undefined behavior when popping an empty fail stack.
---
 include/intprops.h     |  18 ++++++--
 include/regex.h        |   3 +-
 misc/sys/cdefs.h       |  22 +++++----
 posix/regcomp.c        |   8 ++--
 posix/regex.c          |   1 +
 posix/regex.h          |  49 +++++++++++++++-----
 posix/regex_internal.c |  10 +++-
 posix/regex_internal.h |   8 +++-
 posix/regexec.c        | 101 +++++++++++++++++++++++------------------
 9 files changed, 142 insertions(+), 78 deletions(-)

diff --git a/include/intprops.h b/include/intprops.h
index 967e32ea0c..9d10028a59 100644
--- a/include/intprops.h
+++ b/include/intprops.h
@@ -133,7 +133,8 @@
    operators might not yield numerically correct answers due to
    arithmetic overflow.  They do not rely on undefined or
    implementation-defined behavior.  Their implementations are simple
-   and straightforward, but they are a bit harder to use than the
+   and straightforward, but they are harder to use and may be less
+   efficient than the INT_<op>_WRAPV, INT_<op>_OK, and
    INT_<op>_OVERFLOW macros described below.
 
    Example usage:
@@ -158,6 +159,9 @@
    must have minimum value MIN and maximum MAX.  Unsigned types should
    use a zero MIN of the proper type.
 
+   Because all arguments are subject to integer promotions, these
+   macros typically do not work on types narrower than 'int'.
+
    These macros are tuned for constant MIN and MAX.  For commutative
    operations such as A + B, they are also tuned for constant B.  */
 
@@ -339,9 +343,15 @@
    arguments should not have side effects.
 
    The WRAPV macros are not constant expressions.  They support only
-   +, binary -, and *.  Because the WRAPV macros convert the result,
-   they report overflow in different circumstances than the OVERFLOW
-   macros do.
+   +, binary -, and *.
+
+   Because the WRAPV macros convert the result, they report overflow
+   in different circumstances than the OVERFLOW macros do.  For
+   example, in the typical case with 16-bit 'short' and 32-bit 'int',
+   if A, B and R are all of type 'short' then INT_ADD_OVERFLOW (A, B)
+   returns false because the addition cannot overflow after A and B
+   are converted to 'int', whereas INT_ADD_WRAPV (A, B, &R) returns
+   true or false depending on whether the sum fits into 'short'.
 
    These macros are tuned for their last input argument being a constant.
 
diff --git a/include/regex.h b/include/regex.h
index 24eca2c297..34fb67d855 100644
--- a/include/regex.h
+++ b/include/regex.h
@@ -37,7 +37,8 @@ extern int __regcomp (regex_t *__preg, const char *__pattern, int __cflags);
 libc_hidden_proto (__regcomp)
 
 extern int __regexec (const regex_t *__preg, const char *__string,
-		      size_t __nmatch, regmatch_t __pmatch[], int __eflags);
+		      size_t __nmatch, regmatch_t __pmatch[__nmatch],
+		      int __eflags);
 libc_hidden_proto (__regexec)
 
 extern size_t __regerror (int __errcode, const regex_t *__preg,
diff --git a/misc/sys/cdefs.h b/misc/sys/cdefs.h
index e490fc1aeb..4dac9d264d 100644
--- a/misc/sys/cdefs.h
+++ b/misc/sys/cdefs.h
@@ -318,16 +318,18 @@
 #endif
 
 /* The nonnull function attribute marks pointer parameters that
-   must not be NULL.  */
-#ifndef __nonnull
+   must not be NULL.  This has the name __nonnull in glibc,
+   and __attribute_nonnull__ in files shared with Gnulib to avoid
+   collision with a different __nonnull in DragonFlyBSD 5.9.  */
+#ifndef __attribute_nonnull__
 # if __GNUC_PREREQ (3,3) || __glibc_has_attribute (__nonnull__)
-#  define __nonnull(params) __attribute__ ((__nonnull__ params))
+#  define __attribute_nonnull__(params) __attribute__ ((__nonnull__ params))
 # else
-#  define __nonnull(params)
+#  define __attribute_nonnull__(params)
 # endif
-#elif !defined __GLIBC__
-# undef __nonnull
-# define __nonnull(params) _GL_ATTRIBUTE_NONNULL (params)
+#endif
+#ifndef __nonnull
+# define __nonnull(params) __attribute_nonnull__ (params)
 #endif
 
 /* The returns_nonnull function attribute marks the return type of the function
@@ -493,9 +495,9 @@
       [!!sizeof (struct { int __error_if_negative: (expr) ? 2 : -1; })]
 #endif
 
-/* The #ifndef lets Gnulib avoid including these on non-glibc
-   platforms, where the includes typically do not exist.  */
-#ifdef __GLIBC__
+/* Gnulib avoids including these, as they don't work on non-glibc or
+   older glibc platforms.  */
+#ifndef __GNULIB_CDEFS
 # include <bits/wordsize.h>
 # include <bits/long-double.h>
 #endif
diff --git a/posix/regcomp.c b/posix/regcomp.c
index d93698ae78..887e5b5068 100644
--- a/posix/regcomp.c
+++ b/posix/regcomp.c
@@ -1695,12 +1695,14 @@ calc_eclosure_iter (re_node_set *new_set, re_dfa_t *dfa, Idx node, bool root)
   reg_errcode_t err;
   Idx i;
   re_node_set eclosure;
-  bool ok;
   bool incomplete = false;
   err = re_node_set_alloc (&eclosure, dfa->edests[node].nelem + 1);
   if (__glibc_unlikely (err != REG_NOERROR))
     return err;
 
+  /* An epsilon closure includes itself.  */
+  eclosure.elems[eclosure.nelem++] = node;
+
   /* This indicates that we are calculating this node now.
      We reference this value to avoid infinite loop.  */
   dfa->eclosures[node].nelem = -1;
@@ -1753,10 +1755,6 @@ calc_eclosure_iter (re_node_set *new_set, re_dfa_t *dfa, Idx node, bool root)
 	  }
       }
 
-  /* An epsilon closure includes itself.  */
-  ok = re_node_set_insert (&eclosure, node);
-  if (__glibc_unlikely (! ok))
-    return REG_ESPACE;
   if (incomplete && !root)
     dfa->eclosures[node].nelem = 0;
   else
diff --git a/posix/regex.c b/posix/regex.c
index 7296be0f08..d32863972c 100644
--- a/posix/regex.c
+++ b/posix/regex.c
@@ -24,6 +24,7 @@
 
 # if __GNUC_PREREQ (4, 6)
 #  pragma GCC diagnostic ignored "-Wsuggest-attribute=pure"
+#  pragma GCC diagnostic ignored "-Wvla"
 # endif
 # if __GNUC_PREREQ (4, 3)
 #  pragma GCC diagnostic ignored "-Wold-style-definition"
diff --git a/posix/regex.h b/posix/regex.h
index 14fb1d8364..adb69768ee 100644
--- a/posix/regex.h
+++ b/posix/regex.h
@@ -522,6 +522,30 @@ typedef struct
 \f
 /* Declarations for routines.  */
 
+#ifndef _REGEX_NELTS
+# if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
+	&& !defined __STDC_NO_VLA__)
+#  define _REGEX_NELTS(n) n
+# else
+#  define _REGEX_NELTS(n)
+# endif
+#endif
+
+#if defined __GNUC__ && 4 < __GNUC__ + (6 <= __GNUC_MINOR__)
+# pragma GCC diagnostic push
+# pragma GCC diagnostic ignored "-Wvla"
+#endif
+
+#ifndef _Attr_access_
+# ifdef __attr_access
+#  define _Attr_access_(arg) __attr_access (arg)
+# elif defined __GNUC__ && 10 <= __GNUC__
+#  define _Attr_access_(x) __attribute__ ((__access__ x))
+# else
+#  define _Attr_access_(x)
+# endif
+#endif
+
 #ifdef __USE_GNU
 /* Sets the current default syntax to SYNTAX, and return the old syntax.
    You can also simply assign to the 're_syntax_options' variable.  */
@@ -537,7 +561,7 @@ extern reg_syntax_t re_set_syntax (reg_syntax_t __syntax);
    'regfree'.  */
 extern const char *re_compile_pattern (const char *__pattern, size_t __length,
 				       struct re_pattern_buffer *__buffer)
-    __attr_access ((__read_only__, 1, 2));
+    _Attr_access_ ((__read_only__, 1, 2));
 
 
 /* Compile a fastmap for the compiled pattern in BUFFER; used to
@@ -555,7 +579,7 @@ extern regoff_t re_search (struct re_pattern_buffer *__buffer,
 			   const char *__String, regoff_t __length,
 			   regoff_t __start, regoff_t __range,
 			   struct re_registers *__regs)
-    __attr_access ((__read_only__, 2, 3));
+    _Attr_access_ ((__read_only__, 2, 3));
 
 
 /* Like 're_search', but search in the concatenation of STRING1 and
@@ -566,8 +590,8 @@ extern regoff_t re_search_2 (struct re_pattern_buffer *__buffer,
 			     regoff_t __start, regoff_t __range,
 			     struct re_registers *__regs,
 			     regoff_t __stop)
-    __attr_access ((__read_only__, 2, 3))
-    __attr_access ((__read_only__, 4, 5));
+    _Attr_access_ ((__read_only__, 2, 3))
+    _Attr_access_ ((__read_only__, 4, 5));
 
 
 /* Like 're_search', but return how many characters in STRING the regexp
@@ -575,7 +599,7 @@ extern regoff_t re_search_2 (struct re_pattern_buffer *__buffer,
 extern regoff_t re_match (struct re_pattern_buffer *__buffer,
 			  const char *__String, regoff_t __length,
 			  regoff_t __start, struct re_registers *__regs)
-    __attr_access ((__read_only__, 2, 3));
+    _Attr_access_ ((__read_only__, 2, 3));
 
 
 /* Relates to 're_match' as 're_search_2' relates to 're_search'.  */
@@ -584,8 +608,8 @@ extern regoff_t re_match_2 (struct re_pattern_buffer *__buffer,
 			    const char *__string2, regoff_t __length2,
 			    regoff_t __start, struct re_registers *__regs,
 			    regoff_t __stop)
-    __attr_access ((__read_only__, 2, 3))
-    __attr_access ((__read_only__, 4, 5));
+    _Attr_access_ ((__read_only__, 2, 3))
+    _Attr_access_ ((__read_only__, 4, 5));
 
 
 /* Set REGS to hold NUM_REGS registers, storing them in STARTS and
@@ -654,16 +678,19 @@ extern int regcomp (regex_t *_Restrict_ __preg,
 
 extern int regexec (const regex_t *_Restrict_ __preg,
 		    const char *_Restrict_ __String, size_t __nmatch,
-		    regmatch_t __pmatch[_Restrict_arr_],
-		    int __eflags)
-    __attr_access ((__write_only__, 4, 3));
+		    regmatch_t __pmatch[_Restrict_arr_
+					_REGEX_NELTS (__nmatch)],
+		    int __eflags);
 
 extern size_t regerror (int __errcode, const regex_t *_Restrict_ __preg,
 			char *_Restrict_ __errbuf, size_t __errbuf_size)
-    __attr_access ((__write_only__, 3, 4));
+    _Attr_access_ ((__write_only__, 3, 4));
 
 extern void regfree (regex_t *__preg);
 
+#if defined __GNUC__ && 4 < __GNUC__ + (6 <= __GNUC_MINOR__)
+# pragma GCC diagnostic pop
+#endif
 
 #ifdef __cplusplus
 }
diff --git a/posix/regex_internal.c b/posix/regex_internal.c
index 9dd387ef85..aefcfa2f52 100644
--- a/posix/regex_internal.c
+++ b/posix/regex_internal.c
@@ -1211,6 +1211,10 @@ re_node_set_merge (re_node_set *dest, const re_node_set *src)
 
   if (__glibc_unlikely (dest->nelem == 0))
     {
+      /* Although we already guaranteed above that dest->alloc != 0 and
+         therefore dest->elems != NULL, add a debug assertion to pacify
+         GCC 11.2.1's -fanalyzer.  */
+      DEBUG_ASSERT (dest->elems);
       dest->nelem = src->nelem;
       memcpy (dest->elems, src->elems, src->nelem * sizeof (Idx));
       return REG_NOERROR;
@@ -1286,7 +1290,10 @@ re_node_set_insert (re_node_set *set, Idx elem)
 
   if (__glibc_unlikely (set->nelem) == 0)
     {
-      /* We already guaranteed above that set->alloc != 0.  */
+      /* Although we already guaranteed above that set->alloc != 0 and
+         therefore set->elems != NULL, add a debug assertion to pacify
+         GCC 11.2 -fanalyzer.  */
+      DEBUG_ASSERT (set->elems);
       set->elems[0] = elem;
       ++set->nelem;
       return true;
@@ -1314,6 +1321,7 @@ re_node_set_insert (re_node_set *set, Idx elem)
     {
       for (idx = set->nelem; set->elems[idx - 1] > elem; idx--)
 	set->elems[idx] = set->elems[idx - 1];
+      DEBUG_ASSERT (set->elems[idx - 1] < elem);
     }
 
   /* Insert the new element.  */
diff --git a/posix/regex_internal.h b/posix/regex_internal.h
index edcdc07e99..1245e782ff 100644
--- a/posix/regex_internal.h
+++ b/posix/regex_internal.h
@@ -32,6 +32,10 @@
 #include <stdbool.h>
 #include <stdint.h>
 
+#ifndef _LIBC
+# include <dynarray.h>
+#endif
+
 #include <intprops.h>
 #include <verify.h>
 
@@ -49,14 +53,14 @@
 # define lock_fini(lock) ((void) 0)
 # define lock_lock(lock) __libc_lock_lock (lock)
 # define lock_unlock(lock) __libc_lock_unlock (lock)
-#elif defined GNULIB_LOCK && !defined USE_UNLOCKED_IO
+#elif defined GNULIB_LOCK && !defined GNULIB_REGEX_SINGLE_THREAD
 # include "glthread/lock.h"
 # define lock_define(name) gl_lock_define (, name)
 # define lock_init(lock) glthread_lock_init (&(lock))
 # define lock_fini(lock) glthread_lock_destroy (&(lock))
 # define lock_lock(lock) glthread_lock_lock (&(lock))
 # define lock_unlock(lock) glthread_lock_unlock (&(lock))
-#elif defined GNULIB_PTHREAD && !defined USE_UNLOCKED_IO
+#elif defined GNULIB_PTHREAD && !defined GNULIB_REGEX_SINGLE_THREAD
 # include <pthread.h>
 # define lock_define(name) pthread_mutex_t name;
 # define lock_init(lock) pthread_mutex_init (&(lock), 0)
diff --git a/posix/regexec.c b/posix/regexec.c
index f7b4f9cfc3..83e9aaf8ca 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -59,7 +59,7 @@ static void update_regs (const re_dfa_t *dfa, regmatch_t *pmatch,
 			 Idx cur_idx, Idx nmatch);
 static reg_errcode_t push_fail_stack (struct re_fail_stack_t *fs,
 				      Idx str_idx, Idx dest_node, Idx nregs,
-				      regmatch_t *regs,
+				      regmatch_t *regs, regmatch_t *prevregs,
 				      re_node_set *eps_via_nodes);
 static reg_errcode_t set_regs (const regex_t *preg,
 			       const re_match_context_t *mctx,
@@ -186,11 +186,12 @@ static reg_errcode_t extend_buffers (re_match_context_t *mctx, int min_len);
    REG_NOTBOL is set, then ^ does not match at the beginning of the
    string; if REG_NOTEOL is set, then $ does not match at the end.
 
-   We return 0 if we find a match and REG_NOMATCH if not.  */
+   Return 0 if a match is found, REG_NOMATCH if not, REG_BADPAT if
+   EFLAGS is invalid.  */
 
 int
 regexec (const regex_t *__restrict preg, const char *__restrict string,
-	 size_t nmatch, regmatch_t pmatch[], int eflags)
+	 size_t nmatch, regmatch_t pmatch[_REGEX_NELTS (nmatch)], int eflags)
 {
   reg_errcode_t err;
   Idx start, length;
@@ -234,7 +235,7 @@ int
 attribute_compat_text_section
 __compat_regexec (const regex_t *__restrict preg,
 		  const char *__restrict string, size_t nmatch,
-		  regmatch_t pmatch[], int eflags)
+		  regmatch_t pmatch[_REGEX_NELTS (nmatch)], int eflags)
 {
   return regexec (preg, string, nmatch, pmatch,
 		  eflags & (REG_NOTBOL | REG_NOTEOL));
@@ -269,8 +270,8 @@ compat_symbol (libc, __compat_regexec, regexec, GLIBC_2_0);
    strings.)
 
    On success, re_match* functions return the length of the match, re_search*
-   return the position of the start of the match.  Return value -1 means no
-   match was found and -2 indicates an internal error.  */
+   return the position of the start of the match.  They return -1 on
+   match failure, -2 on error.  */
 
 regoff_t
 re_match (struct re_pattern_buffer *bufp, const char *string, Idx length,
@@ -1206,27 +1207,30 @@ check_halt_state_context (const re_match_context_t *mctx,
 /* Compute the next node to which "NFA" transit from NODE("NFA" is a NFA
    corresponding to the DFA).
    Return the destination node, and update EPS_VIA_NODES;
-   return -1 in case of errors.  */
+   return -1 on match failure, -2 on error.  */
 
 static Idx
 proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
+		   regmatch_t *prevregs,
 		   Idx *pidx, Idx node, re_node_set *eps_via_nodes,
 		   struct re_fail_stack_t *fs)
 {
   const re_dfa_t *const dfa = mctx->dfa;
-  Idx i;
-  bool ok;
   if (IS_EPSILON_NODE (dfa->nodes[node].type))
     {
       re_node_set *cur_nodes = &mctx->state_log[*pidx]->nodes;
       re_node_set *edests = &dfa->edests[node];
-      Idx dest_node;
-      ok = re_node_set_insert (eps_via_nodes, node);
-      if (__glibc_unlikely (! ok))
-	return -2;
-      /* Pick up a valid destination, or return -1 if none
-	 is found.  */
-      for (dest_node = -1, i = 0; i < edests->nelem; ++i)
+
+      if (! re_node_set_contains (eps_via_nodes, node))
+        {
+          bool ok = re_node_set_insert (eps_via_nodes, node);
+          if (__glibc_unlikely (! ok))
+            return -2;
+        }
+
+      /* Pick a valid destination, or return -1 if none is found.  */
+      Idx dest_node = -1;
+      for (Idx i = 0; i < edests->nelem; i++)
 	{
 	  Idx candidate = edests->elems[i];
 	  if (!re_node_set_contains (cur_nodes, candidate))
@@ -1244,7 +1248,7 @@ proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
 	      /* Otherwise, push the second epsilon-transition on the fail stack.  */
 	      else if (fs != NULL
 		       && push_fail_stack (fs, *pidx, candidate, nregs, regs,
-					   eps_via_nodes))
+					   prevregs, eps_via_nodes))
 		return -2;
 
 	      /* We know we are going to exit.  */
@@ -1288,7 +1292,7 @@ proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
 	  if (naccepted == 0)
 	    {
 	      Idx dest_node;
-	      ok = re_node_set_insert (eps_via_nodes, node);
+	      bool ok = re_node_set_insert (eps_via_nodes, node);
 	      if (__glibc_unlikely (! ok))
 		return -2;
 	      dest_node = dfa->edests[node].elems[0];
@@ -1317,7 +1321,8 @@ proceed_next_node (const re_match_context_t *mctx, Idx nregs, regmatch_t *regs,
 static reg_errcode_t
 __attribute_warn_unused_result__
 push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, Idx dest_node,
-		 Idx nregs, regmatch_t *regs, re_node_set *eps_via_nodes)
+		 Idx nregs, regmatch_t *regs, regmatch_t *prevregs,
+		 re_node_set *eps_via_nodes)
 {
   reg_errcode_t err;
   Idx num = fs->num++;
@@ -1333,25 +1338,30 @@ push_fail_stack (struct re_fail_stack_t *fs, Idx str_idx, Idx dest_node,
     }
   fs->stack[num].idx = str_idx;
   fs->stack[num].node = dest_node;
-  fs->stack[num].regs = re_malloc (regmatch_t, nregs);
+  fs->stack[num].regs = re_malloc (regmatch_t, 2 * nregs);
   if (fs->stack[num].regs == NULL)
     return REG_ESPACE;
   memcpy (fs->stack[num].regs, regs, sizeof (regmatch_t) * nregs);
+  memcpy (fs->stack[num].regs + nregs, prevregs, sizeof (regmatch_t) * nregs);
   err = re_node_set_init_copy (&fs->stack[num].eps_via_nodes, eps_via_nodes);
   return err;
 }
 
 static Idx
 pop_fail_stack (struct re_fail_stack_t *fs, Idx *pidx, Idx nregs,
-		regmatch_t *regs, re_node_set *eps_via_nodes)
+		regmatch_t *regs, regmatch_t *prevregs,
+		re_node_set *eps_via_nodes)
 {
+  if (fs == NULL || fs->num == 0)
+    return -1;
   Idx num = --fs->num;
-  DEBUG_ASSERT (num >= 0);
   *pidx = fs->stack[num].idx;
   memcpy (regs, fs->stack[num].regs, sizeof (regmatch_t) * nregs);
+  memcpy (prevregs, fs->stack[num].regs + nregs, sizeof (regmatch_t) * nregs);
   re_node_set_free (eps_via_nodes);
   re_free (fs->stack[num].regs);
   *eps_via_nodes = fs->stack[num].eps_via_nodes;
+  DEBUG_ASSERT (0 <= fs->stack[num].node);
   return fs->stack[num].node;
 }
 
@@ -1407,33 +1417,32 @@ set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
     {
       update_regs (dfa, pmatch, prev_idx_match, cur_node, idx, nmatch);
 
-      if (idx == pmatch[0].rm_eo && cur_node == mctx->last_node)
+      if ((idx == pmatch[0].rm_eo && cur_node == mctx->last_node)
+	  || (fs && re_node_set_contains (&eps_via_nodes, cur_node)))
 	{
 	  Idx reg_idx;
+	  cur_node = -1;
 	  if (fs)
 	    {
 	      for (reg_idx = 0; reg_idx < nmatch; ++reg_idx)
 		if (pmatch[reg_idx].rm_so > -1 && pmatch[reg_idx].rm_eo == -1)
-		  break;
-	      if (reg_idx == nmatch)
-		{
-		  re_node_set_free (&eps_via_nodes);
-		  regmatch_list_free (&prev_match);
-		  return free_fail_stack_return (fs);
-		}
-	      cur_node = pop_fail_stack (fs, &idx, nmatch, pmatch,
-					 &eps_via_nodes);
+		  {
+		    cur_node = pop_fail_stack (fs, &idx, nmatch, pmatch,
+					       prev_idx_match, &eps_via_nodes);
+		    break;
+		  }
 	    }
-	  else
+	  if (cur_node < 0)
 	    {
 	      re_node_set_free (&eps_via_nodes);
 	      regmatch_list_free (&prev_match);
-	      return REG_NOERROR;
+	      return free_fail_stack_return (fs);
 	    }
 	}
 
       /* Proceed to next node.  */
-      cur_node = proceed_next_node (mctx, nmatch, pmatch, &idx, cur_node,
+      cur_node = proceed_next_node (mctx, nmatch, pmatch, prev_idx_match,
+				    &idx, cur_node,
 				    &eps_via_nodes, fs);
 
       if (__glibc_unlikely (cur_node < 0))
@@ -1445,13 +1454,13 @@ set_regs (const regex_t *preg, const re_match_context_t *mctx, size_t nmatch,
 	      free_fail_stack_return (fs);
 	      return REG_ESPACE;
 	    }
-	  if (fs)
-	    cur_node = pop_fail_stack (fs, &idx, nmatch, pmatch,
-				       &eps_via_nodes);
-	  else
+	  cur_node = pop_fail_stack (fs, &idx, nmatch, pmatch,
+				     prev_idx_match, &eps_via_nodes);
+	  if (cur_node < 0)
 	    {
 	      re_node_set_free (&eps_via_nodes);
 	      regmatch_list_free (&prev_match);
+	      free_fail_stack_return (fs);
 	      return REG_NOMATCH;
 	    }
 	}
@@ -1495,10 +1504,10 @@ update_regs (const re_dfa_t *dfa, regmatch_t *pmatch,
     }
   else if (type == OP_CLOSE_SUBEXP)
     {
+      /* We are at the last node of this sub expression.  */
       Idx reg_num = dfa->nodes[cur_node].opr.idx + 1;
       if (reg_num < nmatch)
 	{
-	  /* We are at the last node of this sub expression.  */
 	  if (pmatch[reg_num].rm_so < cur_idx)
 	    {
 	      pmatch[reg_num].rm_eo = cur_idx;
@@ -2195,6 +2204,7 @@ sift_states_iter_mb (const re_match_context_t *mctx, re_sift_context_t *sctx,
 
 /* Return the next state to which the current state STATE will transit by
    accepting the current input byte, and update STATE_LOG if necessary.
+   Return NULL on failure.
    If STATE can accept a multibyte char/collating element/back reference
    update the destination of STATE_LOG.  */
 
@@ -2395,7 +2405,7 @@ check_subexp_matching_top (re_match_context_t *mctx, re_node_set *cur_nodes,
 
 #if 0
 /* Return the next state to which the current state STATE will transit by
-   accepting the current input byte.  */
+   accepting the current input byte.  Return NULL on failure.  */
 
 static re_dfastate_t *
 transit_state_sb (reg_errcode_t *err, re_match_context_t *mctx,
@@ -2817,7 +2827,8 @@ find_subexp_node (const re_dfa_t *dfa, const re_node_set *nodes,
 /* Check whether the node TOP_NODE at TOP_STR can arrive to the node
    LAST_NODE at LAST_STR.  We record the path onto PATH since it will be
    heavily reused.
-   Return REG_NOERROR if it can arrive, or REG_NOMATCH otherwise.  */
+   Return REG_NOERROR if it can arrive, REG_NOMATCH if it cannot,
+   REG_ESPACE if memory is exhausted.  */
 
 static reg_errcode_t
 __attribute_warn_unused_result__
@@ -3433,7 +3444,8 @@ build_trtable (const re_dfa_t *dfa, re_dfastate_t *state)
 /* Group all nodes belonging to STATE into several destinations.
    Then for all destinations, set the nodes belonging to the destination
    to DESTS_NODE[i] and set the characters accepted by the destination
-   to DEST_CH[i].  This function return the number of destinations.  */
+   to DEST_CH[i].  Return the number of destinations if successful,
+   -1 on internal error.  */
 
 static Idx
 group_nodes_into_DFAstates (const re_dfa_t *dfa, const re_dfastate_t *state,
@@ -4211,7 +4223,8 @@ match_ctx_add_subtop (re_match_context_t *mctx, Idx node, Idx str_idx)
 }
 
 /* Register the node NODE, whose type is OP_CLOSE_SUBEXP, and which matches
-   at STR_IDX, whose corresponding OP_OPEN_SUBEXP is SUB_TOP.  */
+   at STR_IDX, whose corresponding OP_OPEN_SUBEXP is SUB_TOP.
+   Return the new entry if successful, NULL if memory is exhausted.  */
 
 static re_sub_match_last_t *
 match_ctx_add_sublast (re_sub_match_top_t *subtop, Idx node, Idx str_idx)
-- 
2.31.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-27  8:52                     ` Paul Eggert
@ 2021-08-27 16:34                       ` Martin Sebor
  2021-08-27 17:50                         ` Allow #pragma GCC in headers in conformtest [committed] (was: Re: [PATCH v3] remove attribute access from regexec) Joseph Myers
  2021-08-27 18:57                         ` [PATCH v3] remove attribute access from regexec Paul Eggert
  0 siblings, 2 replies; 22+ messages in thread
From: Martin Sebor @ 2021-08-27 16:34 UTC (permalink / raw)
  To: Paul Eggert; +Cc: GNU C Library

On 8/27/21 2:52 AM, Paul Eggert wrote:
> On 8/26/21 9:47 AM, Martin Sebor wrote:
>> I think I'd prefer to let you do this.
> 
> OK, proposed glibc patch attached. As I lack the time to shepherd each 
> regex fix into Glibc separately, I simply copied regex-relevant source 
> files from Gnulib to Glibc, after merging recent Glibc changes (plus 
> your proposal) into Gnulib.
> 
> Please give the patch a try.

The patch compiles fine with the latest GCC trunk (12.0) but it
seems to cause failures in the conformance tests that I don't think
were there before:

FAIL: conform/POSIX/regex.h/conform
FAIL: conform/POSIX2008/regex.h/conform
FAIL: conform/UNIX98/regex.h/conform
FAIL: conform/XOPEN2K/regex.h/conform
FAIL: conform/XOPEN2K8/regex.h/conform
FAIL: conform/XPG4/regex.h/conform
FAIL: conform/XPG42/regex.h/conform

The first one fails because of this (I suspect the others are
the same):

     Namespace violation: "GCC"
     Namespace violation: "diagnostic"
     Namespace violation: "ignored"
     Namespace violation: "pop"
     Namespace violation: "pragma"
     Namespace violation: "push"
FAIL: Namespace of <regex.h>

I'm guessing it's due to a limitation of the conformance test script
that considers the GCC diagnostic pragmas unsafe to use because they
are in the user-namespace (even though they're not subject to macro
expansion).  I'm still not sure I understand why the #pragma is
needed since -Wvla shouldn't trigger in a system header.

Beyond that I only skimmed your patch.  It includes quite a few
changes that don't seem necessary to avoid the GCC warning (BZ #28170)
and that I wouldn't be comfortable making at the same time.  This is
not an objection, just an explanation why I was reluctant to accept
some of your suggestions.

But just to clarify, I meant I would have preferred to just fix
the Glibc warning and let you change the other Glibc internal APIs
and handle the Gnulib integration.  Looks like you decided against
the former in the end but thanks for taking care of the latter.

Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Allow #pragma GCC in headers in conformtest [committed] (was: Re: [PATCH v3] remove attribute access from regexec)
  2021-08-27 16:34                       ` Martin Sebor
@ 2021-08-27 17:50                         ` Joseph Myers
  2021-08-27 18:57                         ` [PATCH v3] remove attribute access from regexec Paul Eggert
  1 sibling, 0 replies; 22+ messages in thread
From: Joseph Myers @ 2021-08-27 17:50 UTC (permalink / raw)
  To: Martin Sebor; +Cc: Paul Eggert, GNU C Library

No "#pragma GCC" pragma allows macro-expansion of its arguments, so no
namespace issues arise from use of such pragmas in installed headers.
Ignore them in conformtest tests of header namespace.

Tested for x86_64, in conjunction with Paul's patch
<https://sourceware.org/pipermail/libc-alpha/2021-August/130571.html>
adding use of such pragmas to installed headers shared with gnulib.

---

On Fri, 27 Aug 2021, Martin Sebor via Libc-alpha wrote:

> I'm guessing it's due to a limitation of the conformance test script
> that considers the GCC diagnostic pragmas unsafe to use because they
> are in the user-namespace (even though they're not subject to macro
> expansion).  I'm still not sure I understand why the #pragma is
> needed since -Wvla shouldn't trigger in a system header.

That conform test issue should be fixed by this (committed) patch (which 
makes sense independent of whether we actually add such pragmas to any 
installed headers, given that use of #pragma GCC there is indeed 
namespace-safe).

diff --git a/conform/conformtest.py b/conform/conformtest.py
index f0405b7186..b0ec8e7ed1 100644
--- a/conform/conformtest.py
+++ b/conform/conformtest.py
@@ -624,6 +624,14 @@ class HeaderTests(object):
                     continue
                 if re.match(r'# [1-9]', line):
                     continue
+                if line.startswith('#pragma GCC '):
+                    # No GCC pragma uses macro expansion, so no
+                    # namespace issues arise from such pragmas.  (Some
+                    # pragmas not in the GCC namespace do macro-expand
+                    # their arguments and so could be affected by
+                    # macros defined by user code including the
+                    # header.)
+                    continue
                 match = re.match(r'#define (.*)', line)
                 if match:
                     self.check_token(bad_tokens, match.group(1))

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-27 16:34                       ` Martin Sebor
  2021-08-27 17:50                         ` Allow #pragma GCC in headers in conformtest [committed] (was: Re: [PATCH v3] remove attribute access from regexec) Joseph Myers
@ 2021-08-27 18:57                         ` Paul Eggert
  2021-09-20 20:46                           ` Joseph Myers
  1 sibling, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-08-27 18:57 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GNU C Library

On 8/27/21 9:34 AM, Martin Sebor wrote:

> It includes quite a few
> changes that don't seem necessary to avoid the GCC warning (BZ #28170)

True. However, the changes have been tested on the Gnulib side, and I 
feel more comfortable applying them all than doing just some of them. 
It's less work anyway; and the more-work option feels like make-work.

I'll give people more time to comment before installing.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-27 18:57                         ` [PATCH v3] remove attribute access from regexec Paul Eggert
@ 2021-09-20 20:46                           ` Joseph Myers
  2021-09-21  6:52                             ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Joseph Myers @ 2021-09-20 20:46 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Martin Sebor, GNU C Library

On Fri, 27 Aug 2021, Paul Eggert wrote:

> On 8/27/21 9:34 AM, Martin Sebor wrote:
> 
> > It includes quite a few
> > changes that don't seem necessary to avoid the GCC warning (BZ #28170)
> 
> True. However, the changes have been tested on the Gnulib side, and I feel
> more comfortable applying them all than doing just some of them. It's less
> work anyway; and the more-work option feels like make-work.
> 
> I'll give people more time to comment before installing.

Any news on this?  Builds with GCC mainline are still falling over with 
the regexec issue (thus hiding any other issues with GCC mainline that 
might have appeared since then and making them harder to bisect).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-09-20 20:46                           ` Joseph Myers
@ 2021-09-21  6:52                             ` Paul Eggert
  2021-09-21 13:48                               ` Joseph Myers
  0 siblings, 1 reply; 22+ messages in thread
From: Paul Eggert @ 2021-09-21  6:52 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Martin Sebor, GNU C Library

On 9/20/21 1:46 PM, Joseph Myers wrote:
> Any news on this?  Builds with GCC mainline are still falling over with
> the regexec issue (thus hiding any other issues with GCC mainline that
> might have appeared since then and making them harder to bisect).

Sounds like I should install that August 27 patch then. It would sync 
those files with Gnulib. OK with you?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-09-21  6:52                             ` Paul Eggert
@ 2021-09-21 13:48                               ` Joseph Myers
  2021-09-21 15:00                                 ` Paul Eggert
  0 siblings, 1 reply; 22+ messages in thread
From: Joseph Myers @ 2021-09-21 13:48 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Martin Sebor, GNU C Library

On Tue, 21 Sep 2021, Paul Eggert wrote:

> On 9/20/21 1:46 PM, Joseph Myers wrote:
> > Any news on this?  Builds with GCC mainline are still falling over with
> > the regexec issue (thus hiding any other issues with GCC mainline that
> > might have appeared since then and making them harder to bisect).
> 
> Sounds like I should install that August 27 patch then. It would sync those
> files with Gnulib. OK with you?

Yes.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-09-21 13:48                               ` Joseph Myers
@ 2021-09-21 15:00                                 ` Paul Eggert
  0 siblings, 0 replies; 22+ messages in thread
From: Paul Eggert @ 2021-09-21 15:00 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Martin Sebor, GNU C Library

On 9/21/21 6:48 AM, Joseph Myers wrote:

>> Sounds like I should install that August 27 patch then. It would sync those
>> files with Gnulib. OK with you?
> 
> Yes.

OK, done. I built with gcc 11.2.1 20210728 (Red Hat 11.2.1-1). GCC 
mainline builds should work now, though I haven't tested that.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-08-19 23:50           ` [PATCH v3] " Martin Sebor
  2021-08-22  5:06             ` Paul Eggert
@ 2021-10-19 16:39             ` Carlos O'Donell
  2021-10-19 17:06               ` Martin Sebor
  1 sibling, 1 reply; 22+ messages in thread
From: Carlos O'Donell @ 2021-10-19 16:39 UTC (permalink / raw)
  To: Martin Sebor, Paul Eggert; +Cc: GNU C Library

On 8/19/21 19:50, Martin Sebor via Libc-alpha wrote:
> Attached is yet another revision of this patch (v3 to let the patch
> tester pick it up).

Is your v3 resolved given the merge from gnulib?

commit 0b5ca7c3e551e5502f3be3b06453324fe8604e82
Author: Paul Eggert <eggert@cs.ucla.edu>
Date:   Tue Sep 21 07:47:45 2021 -0700

    regex: copy back from Gnulib

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] remove attribute access from regexec
  2021-10-19 16:39             ` Carlos O'Donell
@ 2021-10-19 17:06               ` Martin Sebor
  0 siblings, 0 replies; 22+ messages in thread
From: Martin Sebor @ 2021-10-19 17:06 UTC (permalink / raw)
  To: Carlos O'Donell, Paul Eggert; +Cc: GNU C Library

On 10/19/21 10:39 AM, Carlos O'Donell wrote:
> On 8/19/21 19:50, Martin Sebor via Libc-alpha wrote:
>> Attached is yet another revision of this patch (v3 to let the patch
>> tester pick it up).
> 
> Is your v3 resolved given the merge from gnulib?
> 
> commit 0b5ca7c3e551e5502f3be3b06453324fe8604e82
> Author: Paul Eggert <eggert@cs.ucla.edu>
> Date:   Tue Sep 21 07:47:45 2021 -0700
> 
>      regex: copy back from Gnulib
> 

Yes, I believe so.  I have just rebuilt the latest Glibc with
the latest GCC on x86_64-linux.  The build shows no warnings.

Thanks
Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-10-19 17:07 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-13 18:26 [PATCH] remove attribute access from regexec Martin Sebor
2021-08-13 20:11 ` Paul Eggert
2021-08-13 21:30   ` Martin Sebor
2021-08-13 22:34     ` Paul Eggert
2021-08-14 20:08       ` Martin Sebor
2021-08-18 15:53         ` [PATCH v2] " Martin Sebor
2021-08-18 19:52         ` [PATCH] " Paul Eggert
2021-08-19 23:50           ` [PATCH v3] " Martin Sebor
2021-08-22  5:06             ` Paul Eggert
2021-08-26 15:06               ` Martin Sebor
2021-08-26 16:24                 ` Paul Eggert
2021-08-26 16:47                   ` Martin Sebor
2021-08-27  8:52                     ` Paul Eggert
2021-08-27 16:34                       ` Martin Sebor
2021-08-27 17:50                         ` Allow #pragma GCC in headers in conformtest [committed] (was: Re: [PATCH v3] remove attribute access from regexec) Joseph Myers
2021-08-27 18:57                         ` [PATCH v3] remove attribute access from regexec Paul Eggert
2021-09-20 20:46                           ` Joseph Myers
2021-09-21  6:52                             ` Paul Eggert
2021-09-21 13:48                               ` Joseph Myers
2021-09-21 15:00                                 ` Paul Eggert
2021-10-19 16:39             ` Carlos O'Donell
2021-10-19 17:06               ` Martin Sebor

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).