public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Paul Eggert <eggert@cs.ucla.edu>
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>,
	Florian Weimer <fweimer@redhat.com>,
	libc-alpha@sourceware.org
Subject: Re: [PATCH] posix: Sync gnulib regex implementation
Date: Fri, 29 Jun 2018 23:10:00 -0000	[thread overview]
Message-ID: <09fa88fe-87e5-2c98-43e7-2dfeffabffbf@cs.ucla.edu> (raw)
In-Reply-To: <eafb6875-966a-02f7-40ca-50b8f00bef9a@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 2219 bytes --]

On 06/29/2018 09:00 AM, Adhemerval Zanella wrote:
> Paul, could you check what I am missing on the overflow check so we can
> move forward? I really don't want to get this sync stalled because of
> intprops.h addition as a pre-requisite.

I can add comments to  INT_ADD_WRAPV, if that would help. Please ask 
questions about it; that would help me understand what parts need better 
commenting.

I'm afraid that Florian's comment about testability is a bit of a red 
herring, as intprops.h is an internal header, not a published one. Glibc 
typically doesn't have test cases for macros defined in internal include 
files, and there's no need to test intprops.h directly. Besides, 
intprops.h and INT_ADD_WRAPV have been tested widely in commonly-used 
GNU applications like Coreutils, so it's not like I'm proposing anything 
flaky here.

That being said, this use of INT_ADD_WRAPV is unimportant (as far as I 
know, no Gnu apps use this two-buffer regex searching code any more), so 
I guess it's OK here to write a simple substitute that's good enough for 
regex even if it's not robust or general. The goal should be simplicity, 
portability, and meeting regex's needs. Proposed further Glibc patch 
attached (it assumes your patch). I've installed a similar patch into 
Gnulib.

I really don't want to get into the habit of doing this sort of thing, 
though. There's a lot of hard-won integer overflow expertise in 
intprops.h and it's a waste of time to continue to maintain poor 
imitations of it.


> I tried to check the pos-processed implementation using intprops.h for
> the snippet ... And its results are pretty much unreadable.
Trying to audit this by looking at the preprocessed output is a bit like 
auditing glibc by looking at the generated machine code. It can be done, 
but you can't expect it to be easy. For your particular compiler I 
expect that the assembly-language output is more readable than the 
preprocessor output, so if you want to look at low level stuff I suggest 
looking at assembly language. But really, you should be better off 
reading the original source (and if it needs more comments, please say 
where and why and l'll add them).



[-- Attachment #2: 0001-Merge-better-with-Gnulib.patch --]
[-- Type: text/x-patch, Size: 4137 bytes --]

From 614946f802c3a7c2b11be14f85d3ed80f5e7ab57 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 29 Jun 2018 15:42:16 -0700
Subject: [PATCH] Merge better with Gnulib

* posix/regcomp.c (TYPE_SIGNED): Remove; now done by regex_internal.h.
* posix/regex_internal.h [_LIBC]: Do not include intprops.h.
(TYPE_SIGNED, INT_ADD_WRAPV) [_LIBC]: New macros.
* posix/regexec.c (check_add_overflow_idx): Remove.
(re_search_2_stub): Use INT_ADD_WRAPV insead of check_add_overflow_idx.
---
 ChangeLog              |  9 +++++++++
 posix/regcomp.c        |  4 ----
 posix/regex_internal.h | 18 ++++++++++++++++++
 posix/regexec.c        | 18 +-----------------
 4 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index c51c651220..4e665873ff 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2018-06-29  Paul Eggert  <eggert@cs.ucla.edu>
+
+	Merge better with Gnulib
+	* posix/regcomp.c (TYPE_SIGNED): Remove; now done by regex_internal.h.
+	* posix/regex_internal.h [_LIBC]: Do not include intprops.h.
+	(TYPE_SIGNED, INT_ADD_WRAPV) [_LIBC]: New macros.
+	* posix/regexec.c (check_add_overflow_idx): Remove.
+	(re_search_2_stub): Use INT_ADD_WRAPV insead of check_add_overflow_idx.
+
 2018-06-29  Adhemerval Zanella  <adhemerval.zanella@linaro.org>
 
 	[BZ #23233]
diff --git a/posix/regcomp.c b/posix/regcomp.c
index 6aaa54f66d..7b5ddaad0c 100644
--- a/posix/regcomp.c
+++ b/posix/regcomp.c
@@ -2650,10 +2650,6 @@ parse_dup_op (bin_tree_t *elem, re_string_t *regexp, re_dfa_t *dfa,
   if (BE (tree == NULL, 0))
     goto parse_dup_op_espace;
 
-/* From gnulib's "intprops.h":
-   True if the arithmetic type T is signed.  */
-#define TYPE_SIGNED(t) (! ((t) 0 < (t) -1))
-
   /* This loop is actually executed only when end != -1,
      to rewrite <re>{0,n} as (<re>(<re>...<re>?)?)?...  We have
      already created the start+1-th copy.  */
diff --git a/posix/regex_internal.h b/posix/regex_internal.h
index f38f6b3d80..3b836ed206 100644
--- a/posix/regex_internal.h
+++ b/posix/regex_internal.h
@@ -33,6 +33,24 @@
 #include <stdbool.h>
 #include <stdint.h>
 
+/* Properties of integers.  Although Gnulib has intprops.h, glibc does
+   without for now.  */
+#ifndef _LIBC
+# include "intprops.h"
+#else
+/* True if the real type T is signed.  */
+# define TYPE_SIGNED(t) (! ((t) 0 < (t) -1))
+
+/* True if adding the nonnegative Idx values A and B would overflow.
+   If false, set *R to A + B.  A, B, and R may be evaluated more than
+   once, or zero times.  Although this is not a full implementation of
+   Gnulib INT_ADD_WRAPV, it is good enough for glibc regex code.
+   FIXME: This implementation is a fragile stopgap, and this file would
+   be simpler and more robust if intprops.h were migrated into glibc.  */
+# define INT_ADD_WRAPV(a, b, r) \
+   (IDX_MAX - (a) < (b) ? true : (*(r) = (a) + (b), false))
+#endif
+
 #ifdef _LIBC
 # include <libc-lock.h>
 # define lock_define(name) __libc_lock_define (, name)
diff --git a/posix/regexec.c b/posix/regexec.c
index d64b025d12..63aef97535 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -317,22 +317,6 @@ re_search_2 (struct re_pattern_buffer *bufp, const char *string1, Idx length1,
 weak_alias (__re_search_2, re_search_2)
 #endif
 
-/* Set *R = A + B.  Return true if the answer is mathematically incorrect due
-   to overflow; in this case, *R is the low order bits of the correct
-   answer.  */
-static inline bool
-check_add_overflow_idx (Idx a, Idx b, Idx *r)
-{
-#if __GNUC__ >= 5
-  return __builtin_add_overflow (a, b, r);
-#else
-  *r = a + b;
-  if (a < 0 && b < IDX_MAX -a)
-    return false;
-  return true;
-#endif
-}
-
 static regoff_t
 re_search_2_stub (struct re_pattern_buffer *bufp, const char *string1,
 		  Idx length1, const char *string2, Idx length2, Idx start,
@@ -345,7 +329,7 @@ re_search_2_stub (struct re_pattern_buffer *bufp, const char *string1,
   char *s = NULL;
 
   if (BE ((length1 < 0 || length2 < 0 || stop < 0
-           || check_add_overflow_idx (length1, length2, &len)),
+           || INT_ADD_WRAPV (length1, length2, &len)),
           0))
     return -2;
 
-- 
2.17.1


  reply	other threads:[~2018-06-29 23:10 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-28 16:44 Adhemerval Zanella
2018-06-28 20:22 ` Paul Eggert
2018-06-29 13:48   ` Adhemerval Zanella
2018-06-29 15:26     ` Paul Eggert
2018-06-29 15:41       ` Florian Weimer
2018-06-29 16:01         ` Adhemerval Zanella
2018-06-29 23:10           ` Paul Eggert [this message]
2018-07-03 19:46             ` Adhemerval Zanella
2018-07-03 20:05               ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=09fa88fe-87e5-2c98-43e7-2dfeffabffbf@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=adhemerval.zanella@linaro.org \
    --cc=fweimer@redhat.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).