public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* Re: [glibc] regex: copy back from Gnulib
       [not found] <20210921150048.9E05F3858C3A@sourceware.org>
@ 2021-09-21 21:44 ` Carlos O'Donell
  2021-09-22  1:06   ` Paul Eggert
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos O'Donell @ 2021-09-21 21:44 UTC (permalink / raw)
  To: libc-alpha, Paul Eggert

On 9/21/21 11:00, Paul Eggert via Glibc-cvs wrote:
> commit 0b5ca7c3e551e5502f3be3b06453324fe8604e82
> Author: Paul Eggert <eggert@cs.ucla.edu>
> Date:   Tue Sep 21 07:47:45 2021 -0700
> 
>     regex: copy back from Gnulib
>     
>     Copy regex-related files back from Gnulib, to fix a problem with
>     static checking of regex calls noted by Martin Sebor.  This merges the
>     following changes:
>     
>     * New macro __attribute_nonnull__ in misc/sys/cdefs.h, for use later
>     when copying other files back from Gnulib.
>     
>     * Use __GNULIB_CDEFS instead of __GLIBC__ when deciding
>     whether to include bits/wordsize.h etc.
>     
>     * Avoid duplicate entries in epsilon closure table.
>     
>     * New regex.h macro _REGEX_NELTS to let regexec say that its pmatch
>     arg should contain nmatch elts.  Use that for regexec, instead of
>     __attr_access (which is incorrect).
>     
>     * New regex.h macro _Attr_access_ which is like __attr_access except
>     portable to non-glibc platforms.
>     
>     * Add some DEBUG_ASSERTs to pacify gcc -fanalyzer and to catch
>     recently-fixed performance bugs if they recur.
>     
>     * Add Gnulib-specific stuff to port the dynarray- and lock-using parts
>     of regex code to non-glibc platforms.
>     
>     * Fix glibc bug 11053.
>     
>     * Avoid some undefined behavior when popping an empty fail stack.
Paul,

In the future please post such changes for review on the list (I don't see it on
libc-alpha, patchwork, or see any Reviewed-by: tags).

I would like to review such changes as part of our normal code review. I think such
review adds belt-and-suspenders to the development process.

Could you please also update SHARED-FILES with a "# Merged from gnulib YYYY-MM-DD"
marker for updated files? I would like us to be more diligent in tracking the current
status of shared files so we know what still needs merging.

Note that patches posted for review also benefit from patchwork CI/CD.
This includes an auto-check for i686
e.g. https://patchwork.sourceware.org/project/glibc/patch/20210914054946.46372-1-eggert@cs.ucla.edu/
And CI/CD will get better over time.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [glibc] regex: copy back from Gnulib
  2021-09-21 21:44 ` [glibc] regex: copy back from Gnulib Carlos O'Donell
@ 2021-09-22  1:06   ` Paul Eggert
  2021-09-22 11:55     ` Joseph Myers
  2021-09-24 14:26     ` [glibc] regex: copy back from Gnulib Carlos O'Donell
  0 siblings, 2 replies; 8+ messages in thread
From: Paul Eggert @ 2021-09-22  1:06 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: libc-alpha

[-- Attachment #1: Type: text/plain, Size: 2450 bytes --]

On 9/21/21 2:44 PM, Carlos O'Donell wrote:

> In the future please post such changes for review on the list (I don't see it on
> libc-alpha, patchwork, or see any Reviewed-by: tags).

I posted it on libc-alpha a while ago; that's where Joseph picked it up.

https://sourceware.org/pipermail/libc-alpha/2021-August/130571.html


> I would like to review such changes as part of our normal code review.

I'd also like reviews. Unfortunately it's hard to get good regex 
reviews. The code is hairy and nobody really understands it, other than 
maybe its long-gone author.


> Could you please also update SHARED-FILES with a "# Merged from gnulib YYYY-MM-DD"
> marker for updated files?

Sure, done by committing the attached.

It is unfortunate that glibc has a file for this (SHARED-FILES) that I 
didn't know about until now, while Gnulib has its own file in a fancier 
format for the same thing (config/srclist.txt), so that I need to 
remember to update both. Plus, they surely disagree with each other. At 
some point I'll try to find time to iron out the discrepancies.


PS. When I first tried to commit the change I got the following 
offputting diagnostic. Is there any chance we could change the 
commit-message encoding from ISO-8895-15 to UTF-8? Requiring ISO-8895-15 
seems so 2nd-millennium; even downgrading to requiring ASCII would be 
better, as ASCII is UTF-8-compatible.

> 3-penguin $ git push
> Enumerating objects: 5, done.
> Counting objects: 100% (5/5), done.
> Delta compression using up to 4 threads
> Compressing objects: 100% (3/3), done.
> Writing objects: 100% (3/3), 459 bytes | 229.00 KiB/s, done.
> Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
> remote: *** Invalid revision history for commit 3e45d44dd2922890aec56fbd415621cb0b56a95f:
> remote: *** It contains characters not in the ISO-8859-15 charset.
> remote: *** 
> remote: *** Below is the first line where this was detected (line 1):
> remote: *** | Mention today’s regex merge in SHARED-FILES
> remote: ***                ^
> remote: ***                |
> remote: *** 
> remote: *** Please amend the commit's revision history to remove it
> remote: *** and try again.
> remote: error: hook declined to update refs/heads/master
> To sourceware.org:/git/glibc.git
>  ! [remote rejected]       master -> master (hook declined)
> error: failed to push some refs to 'sourceware.org:/git/glibc.git'

[-- Attachment #2: 0001-Mention-today-s-regex-merge-in-SHARED-FILES.patch --]
[-- Type: text/x-patch, Size: 1635 bytes --]

From 3e45d44dd2922890aec56fbd415621cb0b56a95f Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 21 Sep 2021 17:53:13 -0700
Subject: [PATCH] =?UTF-8?q?Mention=20today=E2=80=99s=20regex=20merge=20in?=
 =?UTF-8?q?=20SHARED-FILES?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 SHARED-FILES | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/SHARED-FILES b/SHARED-FILES
index 38f189727c..03fa41a4fd 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -31,6 +31,10 @@ gnulib:
   crypt/md5.h
   dirent/alphasort.c
   dirent/scandir.c
+  # Merged from gnulib 2021-09-21
+  include/intprops.h
+  # Merged from gnulib 2021-09-21
+  include/regex.h
   locale/programs/3level.h
   # Merged from gnulib 2014-6-23
   malloc/obstack.c
@@ -41,6 +45,8 @@ gnulib:
   misc/error.h
   misc/getpass.c
   misc/mkdtemp.c
+  # Merged from gnulib 2021-09-21
+  misc/sys/cdefs.h
   posix/fnmatch_loop.c
   # Intended to be the same. Gnulib copy contains glibc changes.
   posix/getopt.c
@@ -49,11 +55,17 @@ gnulib:
   # Intended to be the same. Gnulib copy contains glibc changes.
   posix/getopt_int.h
   posix/glob.c
+  # Merged from gnulib 2021-09-21
   posix/regcomp.c
+  # Merged from gnulib 2021-09-21
   posix/regex.c
+  # Merged from gnulib 2021-09-21
   posix/regex.h
+  # Merged from gnulib 2021-09-21
   posix/regex_internal.c
+  # Merged from gnulib 2021-09-21
   posix/regex_internal.h
+  # Merged from gnulib 2021-09-21
   posix/regexec.c
   posix/spawn.c
   posix/spawn_faction_addclose.c
-- 
2.31.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [glibc] regex: copy back from Gnulib
  2021-09-22  1:06   ` Paul Eggert
@ 2021-09-22 11:55     ` Joseph Myers
  2021-09-22 21:07       ` [PATCH] Allow UTF-8 in glibc commit messages Paul Eggert
  2021-09-24 14:26     ` [glibc] regex: copy back from Gnulib Carlos O'Donell
  1 sibling, 1 reply; 8+ messages in thread
From: Joseph Myers @ 2021-09-22 11:55 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Carlos O'Donell, libc-alpha

On Tue, 21 Sep 2021, Paul Eggert wrote:

> PS. When I first tried to commit the change I got the following offputting
> diagnostic. Is there any chance we could change the commit-message encoding
> from ISO-8895-15 to UTF-8? Requiring ISO-8895-15 seems so 2nd-millennium; even
> downgrading to requiring ASCII would be better, as ASCII is UTF-8-compatible.

That's a matter of setting hooks.no-rh-character-range-check in 
refs/meta/config:project.config.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] Allow UTF-8 in glibc commit messages
  2021-09-22 11:55     ` Joseph Myers
@ 2021-09-22 21:07       ` Paul Eggert
  2021-09-24 14:08         ` Carlos O'Donell
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Eggert @ 2021-09-22 21:07 UTC (permalink / raw)
  To: Joseph Myers; +Cc: Carlos O'Donell, libc-alpha, Mike Frysinger

[-- Attachment #1: Type: text/plain, Size: 314 bytes --]

On 9/22/21 4:55 AM, Joseph Myers wrote:

> That's a matter of setting hooks.no-rh-character-range-check in
> refs/meta/config:project.config.

Thanks for pointing me at that setting, which I didn't know about. I now 
see that this issue is also glibc bug 27299. Proposed patch to the 
meta/config branch attached.

[-- Attachment #2: 0001-Allow-UTF-8-in-commit-messages-bug-27299.patch --]
[-- Type: text/x-patch, Size: 728 bytes --]

From e08caa9ead42af8bb3d1fb41cd18af59b3098616 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 22 Sep 2021 14:02:06 -0700
Subject: [PATCH] Allow UTF-8 in commit messages (bug 27299)

---
 project.config | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/project.config b/project.config
index 4284e9f9df..349bdde2db 100644
--- a/project.config
+++ b/project.config
@@ -7,6 +7,10 @@
         # debugging session, error messages, logs, etc.
         max-rh-line-length = 0
 
+	# Allow UTF-8 in commit messages, instead of requiring
+	# them to use ISO-8859-15.
+	no-rh-character-range-check = true
+
         # We allow a 5MiB email diff maximum.
         max-email-diff-size = 5242880
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Allow UTF-8 in glibc commit messages
  2021-09-22 21:07       ` [PATCH] Allow UTF-8 in glibc commit messages Paul Eggert
@ 2021-09-24 14:08         ` Carlos O'Donell
  2021-09-24 14:13           ` Carlos O'Donell
  0 siblings, 1 reply; 8+ messages in thread
From: Carlos O'Donell @ 2021-09-24 14:08 UTC (permalink / raw)
  To: Paul Eggert, Joseph Myers; +Cc: libc-alpha, Mike Frysinger

On 9/22/21 17:07, Paul Eggert wrote:
> On 9/22/21 4:55 AM, Joseph Myers wrote:
> 
>> That's a matter of setting hooks.no-rh-character-range-check in
>> refs/meta/config:project.config.
> 
> Thanks for pointing me at that setting, which I didn't know about. I now see that this issue is also glibc bug 27299. Proposed patch to the meta/config branch attached.

I'm going to integrate this right now.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Allow UTF-8 in glibc commit messages
  2021-09-24 14:08         ` Carlos O'Donell
@ 2021-09-24 14:13           ` Carlos O'Donell
  0 siblings, 0 replies; 8+ messages in thread
From: Carlos O'Donell @ 2021-09-24 14:13 UTC (permalink / raw)
  To: Paul Eggert, Joseph Myers; +Cc: libc-alpha, Mike Frysinger

On 9/24/21 10:08, Carlos O'Donell wrote:
> On 9/22/21 17:07, Paul Eggert wrote:
>> On 9/22/21 4:55 AM, Joseph Myers wrote:
>>
>>> That's a matter of setting hooks.no-rh-character-range-check in
>>> refs/meta/config:project.config.
>>
>> Thanks for pointing me at that setting, which I didn't know about. I now see that this issue is also glibc bug 27299. Proposed patch to the meta/config branch attached.
> 
> I'm going to integrate this right now.
> 

This is live now and so future commits should work.

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [glibc] regex: copy back from Gnulib
  2021-09-22  1:06   ` Paul Eggert
  2021-09-22 11:55     ` Joseph Myers
@ 2021-09-24 14:26     ` Carlos O'Donell
  2021-09-24 21:34       ` Paul Eggert
  1 sibling, 1 reply; 8+ messages in thread
From: Carlos O'Donell @ 2021-09-24 14:26 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha

On 9/21/21 21:06, Paul Eggert wrote:
> On 9/21/21 2:44 PM, Carlos O'Donell wrote:
> 
>> In the future please post such changes for review on the list (I don't see it on
>> libc-alpha, patchwork, or see any Reviewed-by: tags).
> 
> I posted it on libc-alpha a while ago; that's where Joseph picked it up.
> 
> https://sourceware.org/pipermail/libc-alpha/2021-August/130571.html

This is an RFC under an unrelated subject thread.
 
>> I would like to review such changes as part of our normal code review.
> 
> I'd also like reviews. Unfortunately it's hard to get good regex reviews. The code is hairy and nobody really understands it, other than maybe its long-gone author.
 
I just spend more than a week fixing nrules==0 for fnmatch, regexp, and regcomp.

I would like to review how things are evolving there and I'd be happy to review.

I need to circle back and fix nrules==0 for both glibc and gnulib.

What I don't think we need to do is rush a merge from gnulib.

We have time until the release of glibc 2.35 and we can and should follow the
best engineering practices we have adopted.
 
>> Could you please also update SHARED-FILES with a "# Merged from gnulib YYYY-MM-DD"
>> marker for updated files?
> 
> Sure, done by committing the attached.

Thank you!

> It is unfortunate that glibc has a file for this (SHARED-FILES) that I didn't know about until now, while Gnulib has its own file in a fancier format for the same thing (config/srclist.txt), so that I need to remember to update both. Plus, they surely disagree with each other. At some point I'll try to find time to iron out the discrepancies.

You reviewed Siddhesh's patch to add SHARED-FILES on August 24th.

https://sourceware.org/pipermail/libc-alpha/2021-August/130455.html

The evolution of similar formats is just an indication of a common
requirement. I'm fine with two different file formats, that we can
eventually work to harmonize. That's just how FOSS evolves :-)

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [glibc] regex: copy back from Gnulib
  2021-09-24 14:26     ` [glibc] regex: copy back from Gnulib Carlos O'Donell
@ 2021-09-24 21:34       ` Paul Eggert
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Eggert @ 2021-09-24 21:34 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: libc-alpha

On 9/24/21 7:26 AM, Carlos O'Donell wrote:

>> I posted it on libc-alpha a while ago; that's where Joseph picked it up.
>>
>> https://sourceware.org/pipermail/libc-alpha/2021-August/130571.html
> 
> This is an RFC under an unrelated subject thread.

I'm afraid I don't quite parse what you wrote. I was trying to say that 
I did post the patch for review on libc-alpha. I guess I did it the 
wrong way?


> I just spend more than a week fixing nrules==0 for fnmatch, regexp, and regcomp.
> 
> I would like to review how things are evolving there and I'd be happy to review.
> 
> I need to circle back and fix nrules==0 for both glibc and gnulib.
> 
> What I don't think we need to do is rush a merge from gnulib.

These things are rarely rushed. Fixes have sometimes languished in 
Gnulib for months or even years. I didn't mean to rush it this time 
either; I installed into Glibc only because I was asked (to fix a build 
problem) and was given the OK.

I don't know what you mean by "fix nrules==0" but would be happy to test 
any patch you write, using Gnulib's testing framework for this stuff 
(yes, we have our own :-).

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-09-24 21:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20210921150048.9E05F3858C3A@sourceware.org>
2021-09-21 21:44 ` [glibc] regex: copy back from Gnulib Carlos O'Donell
2021-09-22  1:06   ` Paul Eggert
2021-09-22 11:55     ` Joseph Myers
2021-09-22 21:07       ` [PATCH] Allow UTF-8 in glibc commit messages Paul Eggert
2021-09-24 14:08         ` Carlos O'Donell
2021-09-24 14:13           ` Carlos O'Donell
2021-09-24 14:26     ` [glibc] regex: copy back from Gnulib Carlos O'Donell
2021-09-24 21:34       ` Paul Eggert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).