public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Paul Eggert <eggert@cs.ucla.edu>
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: libc-alpha@sourceware.org, bug-gnulib@gnu.org
Subject: Re: [PATCH 0/8] Remove alloca usage from glob
Date: Wed, 13 Jan 2021 11:36:31 -0800	[thread overview]
Message-ID: <8f12a10d-74ec-aaf6-7512-8021fabee24a@cs.ucla.edu> (raw)
In-Reply-To: <20210105185820.3796657-1-adhemerval.zanella@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 1699 bytes --]

On 1/5/21 10:58 AM, Adhemerval Zanella wrote:
> The idea of removing the alloca allows a slight better code generation,
> simplifies the boilerplate code to avoid the unbounded alloca usage,
> and it plays better with security compiler mitigation tools (such as the
> ones for stack clash).

Instead of complicating dynarray by adding char_array stuff, I suggest 
adding any primitives just to glob for now, as it's not clear that 
they're generally useful.

I do see a problem with the proposed patch set, in that it creates 
several different dynarrays when glob really needs only one or two. The 
existing code is full of memory-allocation gotchas and would be 
simplified if it treated the memory it allocates as a first-class part 
of the problem rather than as some sort of cranky subsidiary that needs 
to be babied and its diaper changed at random intervals.

Attached is a draft set of patches against Gnulib commit 
6a00fdb4bb105697aa27ba97ef7ec33287790ad3 which gives a hint about the 
sort of thing that I mean here. This patch set is not complete (it does 
only the equivalent of the first four of the patches you proposed) and 
it's not exactly the form that I want, so I haven't installed it into 
Gnulib. However, I hope it shows the sort of thing I have in mind. So 
far, it's needed only one scratch buffer.

In this patch set, glob.c continues to use scratch_buffer.h because 
scratch buffers are good enough for all the changes needed so far. I 
suppose dynarrays will be helpful for later patches and that we can 
switch to them as needed, but I wanted to focus on the actual problem 
first rather than worrying about scratch buffers vs dynarrays.

[-- Attachment #2: 0001-glob-use-scratch_buffer-for-internal-glob-dirname.patch --]
[-- Type: text/x-patch, Size: 29786 bytes --]

From 7bd5987cf0ed6668d34a921866c94b112594e714 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 12 Jan 2021 17:24:17 -0800
Subject: [PATCH 1/3] glob: use scratch_buffer for internal glob dirname

This is an alternative implementation of a patch for glibc
proposed by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2021-January/121344.html
* lib/glob.c (scratch_string): New static function.
(__glob): Just init and free a scratch buffer, and let glob_buf
do the real work.  This simplifies memory allocation cleanup.
(glob_buf): New  static function, like the old __glob but with
a new struct scratch_buffer arg.  Use the scratch buffer instead
of alloca/malloc for dirname, drive_spec [__MSDOS__ || WINDOWS32 only].
home_dir, user_name.  Use bool for internal booleans.
---
 ChangeLog  |  14 ++
 lib/glob.c | 432 +++++++++++++++++++----------------------------------
 2 files changed, 171 insertions(+), 275 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1e589aac3..bf235ce49 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+2021-01-12  Paul Eggert  <eggert@cs.ucla.edu>
+
+	glob: use scratch_buffer for internal glob dirname
+	This is an alternative implementation of a patch for glibc
+	proposed by Adhemerval Zanella in:
+	https://sourceware.org/pipermail/libc-alpha/2021-January/121344.html
+	* lib/glob.c (scratch_string): New static function.
+	(__glob): Just init and free a scratch buffer, and let glob_buf
+	do the real work.  This simplifies memory allocation cleanup.
+	(glob_buf): New  static function, like the old __glob but with
+	a new struct scratch_buffer arg.  Use the scratch buffer instead
+	of alloca/malloc for dirname, drive_spec [__MSDOS__ || WINDOWS32 only].
+	home_dir, user_name.  Use bool for internal booleans.
+
 2021-01-08  Paul Eggert  <eggert@cs.ucla.edu>
 
 	dynarray: work even if ‘free’ is replaced
diff --git a/lib/glob.c b/lib/glob.c
index 32c88e5d1..44d2fdd5f 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -237,6 +237,8 @@ glob_use_alloca (size_t alloca_used, size_t len)
           && __libc_use_alloca (size));
 }
 
+static int glob_buf (char const *, int, int (*) (char const *, int), glob_t *,
+                     struct scratch_buffer *);
 static int glob_in_dir (const char *pattern, const char *directory,
                         int flags, int (*errfunc) (const char *, int),
                         glob_t *pglob, size_t alloca_used);
@@ -244,6 +246,20 @@ static int prefix_array (const char *prefix, char **array, size_t n) __THROWNL;
 static int collated_compare (const void *, const void *) __THROWNL;
 
 
+/* Using GLOBBUF for storage, return a string with contents equal to
+   BUF with length BUFLEN.  Terminate the string with a null byte.
+   Return NULL on memory allocation failure.  */
+static char *
+scratch_string (struct scratch_buffer *globbuf, char const *buf, size_t buflen)
+{
+  if (!scratch_buffer_set_array_size (globbuf, buflen + 1, 1))
+    return NULL;
+  char *copy = globbuf->data;
+  char *copy_end = mempcpy (copy, buf, buflen);
+  *copy_end = '\0';
+  return copy;
+}
+
 /* Return true if FILENAME is a directory or a symbolic link to a directory.
    Use FLAGS and PGLOB to resolve the filename.  */
 static bool
@@ -290,23 +306,32 @@ next_brace_sub (const char *cp, int flags)
    it is called with the pathname that caused the error, and the
    'errno' value from the failing call; if it returns non-zero
    'glob' returns GLOB_ABORTED; if it returns zero, the error is ignored.
-   If memory cannot be allocated for PGLOB, GLOB_NOSPACE is returned.
+   If memory cannot be allocated, return GLOB_NOSPACE.
    Otherwise, 'glob' returns zero.  */
 int
 GLOB_ATTRIBUTE
 __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         glob_t *pglob)
+{
+  struct scratch_buffer globbuf;
+  scratch_buffer_init (&globbuf);
+  int result = glob_buf (pattern, flags, errfunc, pglob, &globbuf);
+  scratch_buffer_free (&globbuf);
+  return result;
+}
+
+/* Like 'glob', but with an additional scratch buffer arg GLOBBUF.  */
+static int
+glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
+          glob_t *pglob, struct scratch_buffer *globbuf)
 {
   const char *filename;
-  char *dirname = NULL;
+  char *dirname;
   size_t dirlen;
   int status;
   size_t oldcount;
   int meta;
-  int dirname_modified;
-  int malloc_dirname = 0;
   glob_t dirs;
-  int retval = 0;
   size_t alloca_used = 0;
 
   if (pattern == NULL || pglob == NULL || (flags & ~__GLOB_FLAGS) != 0)
@@ -492,20 +517,21 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
     filename = strchr (pattern, ':');
 #endif /* __MSDOS__ || WINDOWS32 */
 
-  dirname_modified = 0;
+  bool dirname_modified = false;
   if (filename == NULL)
     {
       /* This can mean two things: a simple name or "~name".  The latter
          case is nothing but a notation for a directory.  */
       if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] == '~')
         {
-          dirname = (char *) pattern;
           dirlen = strlen (pattern);
+          dirname = scratch_string (globbuf, pattern, dirlen);
+          if (dirname == NULL)
+            return GLOB_NOSPACE;
 
-          /* Set FILENAME to NULL as a special flag.  This is ugly but
+          /* Keep FILENAME == NULL as a special flag.  This is ugly but
              other solutions would require much more code.  We test for
              this special case below.  */
-          filename = NULL;
         }
       else
         {
@@ -516,7 +542,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             }
 
           filename = pattern;
-          dirname = (char *) ".";
+          dirname = scratch_string (globbuf, ".", 1);
           dirlen = 0;
         }
     }
@@ -525,47 +551,34 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                && (flags & GLOB_NOESCAPE) == 0))
     {
       /* "/pattern" or "\\/pattern".  */
-      dirname = (char *) "/";
+      dirname = scratch_string (globbuf, "/", 1);
       dirlen = 1;
       ++filename;
     }
   else
     {
-      char *newp;
       dirlen = filename - pattern;
+      dirname = scratch_string (globbuf, pattern, dirlen);
+      if (dirname == NULL)
+        return GLOB_NOSPACE;
+      ++filename;
+
 #if defined __MSDOS__ || defined WINDOWS32
-      if (*filename == ':'
-          || (filename > pattern + 1 && filename[-1] == ':'))
+      if (filename[-1] == ':' || (2 < dirlen && filename[-2] == ':'))
         {
-          char *drive_spec;
-
           ++dirlen;
-          drive_spec = __alloca (dirlen + 1);
-          *((char *) mempcpy (drive_spec, pattern, dirlen)) = '\0';
+          dirname = scratch_string (globbuf, pattern, dirlen);
+          if (dirname == NULL)
+            return GLOB_NOSPACE;
           /* For now, disallow wildcards in the drive spec, to
              prevent infinite recursion in glob.  */
-          if (__glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE)))
+          if (__glob_pattern_p (dirname, !(flags & GLOB_NOESCAPE)))
             return GLOB_NOMATCH;
           /* If this is "d:pattern", we need to copy ':' to DIRNAME
              as well.  If it's "d:/pattern", don't remove the slash
              from "d:/", since "d:" and "d:/" are not the same.*/
         }
-#endif
 
-      if (glob_use_alloca (alloca_used, dirlen + 1))
-        newp = alloca_account (dirlen + 1, alloca_used);
-      else
-        {
-          newp = malloc (dirlen + 1);
-          if (newp == NULL)
-            return GLOB_NOSPACE;
-          malloc_dirname = 1;
-        }
-      *((char *) mempcpy (newp, pattern, dirlen)) = '\0';
-      dirname = newp;
-      ++filename;
-
-#if defined __MSDOS__ || defined WINDOWS32
       bool drive_root = (dirlen > 1
                          && (dirname[dirlen - 1] == ':'
                              || (dirlen > 2 && dirname[dirlen - 2] == ':'
@@ -582,12 +595,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               /* "pattern\\/".  Remove the final backslash if it hasn't
                  been quoted.  */
-              char *p = (char *) &dirname[dirlen - 1];
+              char *p = &dirname[dirlen - 1];
 
               while (p > dirname && p[-1] == '\\') --p;
               if ((&dirname[dirlen] - p) & 1)
                 {
-                  *(char *) &dirname[--dirlen] = '\0';
+                  dirname[--dirlen] = '\0';
                   flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
                 }
             }
@@ -603,8 +616,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               oldcount = pglob->gl_pathc + pglob->gl_offs;
               goto no_matches;
             }
-          retval = val;
-          goto out;
+          return val;
         }
     }
 
@@ -615,8 +627,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               && (dirname[2] == '\0' || dirname[2] == '/')))
         {
           /* Look up home directory.  */
-          char *home_dir = getenv ("HOME");
-          int malloc_home_dir = 0;
+          char const *home_dir = getenv ("HOME");
           if (home_dir == NULL || home_dir[0] == '\0')
             {
 #ifdef WINDOWS32
@@ -629,8 +640,10 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 {
                   size_t home_drive_len = strlen (home_drive);
                   size_t home_path_len = strlen (home_path);
-                  char *mem = alloca (home_drive_len + home_path_len + 1);
-
+                  size_t needed = home_drive_len + home_path_len + 1;
+                  if (!scratch_buffer_set_array_size (globbuf, needed, 1))
+                    return GLOB_NOSPACE;
+                  char *mem = globbuf->data;
                   memcpy (mem, home_drive, home_drive_len);
                   memcpy (mem + home_drive_len, home_path, home_path_len + 1);
                   home_dir = mem;
@@ -638,236 +651,128 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               else
                 home_dir = "c:/users/default"; /* poor default */
 #else
-              int err;
-              struct passwd *p;
-              struct passwd pwbuf;
-              struct scratch_buffer s;
-              scratch_buffer_init (&s);
               while (true)
                 {
-                  p = NULL;
-                  err = __getlogin_r (s.data, s.length);
+                  struct passwd *p = NULL;
+                  char *sdata = globbuf->data;
+                  size_t pwsize = globbuf->length / 2;
+                  size_t loginsize = globbuf->length - pwsize;
+                  int err = __getlogin_r (sdata, loginsize);
                   if (err == 0)
                     {
 # if defined HAVE_GETPWNAM_R || defined _LIBC
-                      size_t ssize = strlen (s.data) + 1;
-                      char *sdata = s.data;
-                      err = getpwnam_r (sdata, &pwbuf, sdata + ssize,
-                                        s.length - ssize, &p);
+                      struct passwd pwbuf;
+                      err = getpwnam_r (sdata, &pwbuf,
+                                        sdata + loginsize, pwsize, &p);
 # else
-                      p = getpwnam (s.data);
+                      errno = 0;
+                      p = getpwnam (sdata);
                       if (p == NULL)
                         err = errno;
+                      else if (globbuf->length <= strlen (p->pw_dir))
+                        {
+                          p = NULL;
+                          err = ERANGE;
+                        }
 # endif
+                      if (p != NULL)
+                        home_dir = strcpy (sdata, p->pw_dir);
                     }
                   if (err != ERANGE)
                     break;
-                  if (!scratch_buffer_grow (&s))
-                    {
-                      retval = GLOB_NOSPACE;
-                      goto out;
-                    }
-                }
-              if (err == 0)
-                {
-                  home_dir = strdup (p->pw_dir);
-                  malloc_home_dir = 1;
-                }
-              scratch_buffer_free (&s);
-              if (err == 0 && home_dir == NULL)
-                {
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  if (!scratch_buffer_grow (globbuf))
+                    return GLOB_NOSPACE;
                 }
 #endif /* WINDOWS32 */
             }
           if (home_dir == NULL || home_dir[0] == '\0')
             {
-              if (__glibc_unlikely (malloc_home_dir))
-                free (home_dir);
               if (flags & GLOB_TILDE_CHECK)
-                {
-                  retval = GLOB_NOMATCH;
-                  goto out;
-                }
-              else
-                {
-                  home_dir = (char *) "~"; /* No luck.  */
-                  malloc_home_dir = 0;
-                }
-            }
-          /* Now construct the full directory.  */
-          if (dirname[1] == '\0')
-            {
-              if (__glibc_unlikely (malloc_dirname))
-                free (dirname);
+                return GLOB_NOMATCH;
 
-              dirname = home_dir;
-              dirlen = strlen (dirname);
-              malloc_dirname = malloc_home_dir;
+              home_dir = "~"; /* No luck.  */
             }
-          else
-            {
-              char *newp;
-              size_t home_len = strlen (home_dir);
-              int use_alloca = glob_use_alloca (alloca_used, home_len + dirlen);
-              if (use_alloca)
-                newp = alloca_account (home_len + dirlen, alloca_used);
-              else
-                {
-                  newp = malloc (home_len + dirlen);
-                  if (newp == NULL)
-                    {
-                      if (__glibc_unlikely (malloc_home_dir))
-                        free (home_dir);
-                      retval = GLOB_NOSPACE;
-                      goto out;
-                    }
-                }
 
-              mempcpy (mempcpy (newp, home_dir, home_len),
-                       &dirname[1], dirlen);
-
-              if (__glibc_unlikely (malloc_dirname))
-                free (dirname);
-
-              dirname = newp;
-              dirlen += home_len - 1;
-              malloc_dirname = !use_alloca;
-
-              if (__glibc_unlikely (malloc_home_dir))
-                free (home_dir);
-            }
-          dirname_modified = 1;
+          /* Now construct the full directory.  */
+          size_t home_len = strlen (home_dir);
+          if (home_dir == globbuf->data)
+            home_dir = NULL;
+          while (globbuf->length < home_len + dirlen)
+            if (!scratch_buffer_grow_preserve (globbuf))
+              return GLOB_NOSPACE;
+          dirname = globbuf->data;
+          if (home_dir != NULL)
+            memcpy (dirname, home_dir, home_len);
+          char *dirend = mempcpy (dirname + home_len, &pattern[1], dirlen - 1);
+          *dirend = '\0';
+          dirlen += home_len - 1;
+          dirname_modified = true;
         }
       else
         {
 #ifndef WINDOWS32
           char *end_name = strchr (dirname, '/');
-          char *user_name;
-          int malloc_user_name = 0;
-          char *unescape = NULL;
-
-          if (!(flags & GLOB_NOESCAPE))
-            {
-              if (end_name == NULL)
-                {
-                  unescape = strchr (dirname, '\\');
-                  if (unescape)
-                    end_name = strchr (unescape, '\0');
-                }
-              else
-                unescape = memchr (dirname, '\\', end_name - dirname);
-            }
           if (end_name == NULL)
-            user_name = dirname + 1;
-          else
+            end_name = dirname + dirlen;
+          size_t end_offset = end_name - dirname;
+          size_t user_size = 0;
+          for (size_t i = 1; i < end_offset;
+               dirname[user_size++] = dirname[i++])
             {
-              char *newp;
-              if (glob_use_alloca (alloca_used, end_name - dirname))
-                newp = alloca_account (end_name - dirname, alloca_used);
-              else
+              if (! (flags & GLOB_NOESCAPE) && dirname[i] == '\\')
                 {
-                  newp = malloc (end_name - dirname);
-                  if (newp == NULL)
+                  i++;
+                  if (i == end_offset)
                     {
-                      retval = GLOB_NOSPACE;
-                      goto out;
+                      /* "~fo\\o\\" unescapes to user name "foo\\",
+                         but "~fo\\o\\/" unescapes to user name "foo".  */
+                      if (filename == NULL)
+                        dirname[user_size++] = '\\';
+                      break;
                     }
-                  malloc_user_name = 1;
                 }
-              if (unescape != NULL)
-                {
-                  char *p = mempcpy (newp, dirname + 1,
-                                     unescape - dirname - 1);
-                  char *q = unescape;
-                  while (q != end_name)
-                    {
-                      if (*q == '\\')
-                        {
-                          if (q + 1 == end_name)
-                            {
-                              /* "~fo\\o\\" unescape to user_name "foo\\",
-                                 but "~fo\\o\\/" unescape to user_name
-                                 "foo".  */
-                              if (filename == NULL)
-                                *p++ = '\\';
-                              break;
-                            }
-                          ++q;
-                        }
-                      *p++ = *q++;
-                    }
-                  *p = '\0';
-                }
-              else
-                *((char *) mempcpy (newp, dirname + 1, end_name - dirname - 1))
-                  = '\0';
-              user_name = newp;
             }
+          dirname[user_size++] = '\0';
 
           /* Look up specific user's home directory.  */
           {
             struct passwd *p;
-            struct scratch_buffer pwtmpbuf;
-            scratch_buffer_init (&pwtmpbuf);
+            ptrdiff_t home_offset = -1;
 
 #  if defined HAVE_GETPWNAM_R || defined _LIBC
             struct passwd pwbuf;
 
-            while (getpwnam_r (user_name, &pwbuf,
-                               pwtmpbuf.data, pwtmpbuf.length, &p)
+            while (getpwnam_r (dirname, &pwbuf, dirname + user_size,
+                               globbuf->length - user_size, &p)
                    == ERANGE)
               {
-                if (!scratch_buffer_grow (&pwtmpbuf))
-                  {
-                    retval = GLOB_NOSPACE;
-                    goto out;
-                  }
+                if (!scratch_buffer_grow_preserve (globbuf))
+                  return GLOB_NOSPACE;
+                dirname = globbuf->data;
               }
+            if (p != NULL)
+              home_offset = p->pw_dir - dirname;
 #  else
-            p = getpwnam (user_name);
+            p = getpwnam (dirname);
 #  endif
 
-            if (__glibc_unlikely (malloc_user_name))
-              free (user_name);
-
-            /* If we found a home directory use this.  */
             if (p != NULL)
               {
+                /* Use the home directory that we found.  */
                 size_t home_len = strlen (p->pw_dir);
-                size_t rest_len = end_name == NULL ? 0 : strlen (end_name);
-                /* dirname contains end_name; we can't free it now.  */
-                char *prev_dirname =
-                  (__glibc_unlikely (malloc_dirname) ? dirname : NULL);
-                char *d;
-
-                malloc_dirname = 0;
-
-                if (glob_use_alloca (alloca_used, home_len + rest_len + 1))
-                  dirname = alloca_account (home_len + rest_len + 1,
-                                            alloca_used);
-                else
-                  {
-                    dirname = malloc (home_len + rest_len + 1);
-                    if (dirname == NULL)
-                      {
-                        free (prev_dirname);
-                        scratch_buffer_free (&pwtmpbuf);
-                        retval = GLOB_NOSPACE;
-                        goto out;
-                      }
-                    malloc_dirname = 1;
-                  }
-                d = mempcpy (dirname, p->pw_dir, home_len);
-                if (end_name != NULL)
-                  d = mempcpy (d, end_name, rest_len);
-                *d = '\0';
-
-                free (prev_dirname);
-
+                size_t rest_len = dirlen - end_offset;
                 dirlen = home_len + rest_len;
-                dirname_modified = 1;
+                while (globbuf->length <= dirlen)
+                  if (!scratch_buffer_grow_preserve (globbuf))
+                    return GLOB_NOSPACE;
+                dirname = globbuf->data;
+                char *home = (home_offset < 0 ? p->pw_dir
+                              : dirname + home_offset);
+                memmove (dirname, home, home_len);
+                char *dirend = mempcpy (dirname + home_len,
+                                        pattern + end_offset, rest_len);
+                *dirend = '\0';
+                dirname_modified = true;
               }
             else
               {
@@ -875,11 +780,12 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   {
                     /* We have to regard it as an error if we cannot find the
                        home directory.  */
-                    retval = GLOB_NOMATCH;
-                    goto out;
+                    return GLOB_NOMATCH;
                   }
+
+                /* Restore the original tilde pattern.  */
+                dirname = scratch_string (globbuf, pattern, dirlen);
               }
-            scratch_buffer_free (&pwtmpbuf);
           }
 #endif /* !WINDOWS32 */
         }
@@ -898,8 +804,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           free (pglob->gl_pathv);
           pglob->gl_pathv = NULL;
           pglob->gl_pathc = 0;
-          retval = GLOB_NOSPACE;
-          goto out;
+          return GLOB_NOSPACE;
         }
 
       new_gl_pathv = realloc (pglob->gl_pathv,
@@ -910,27 +815,20 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
       if (flags & GLOB_MARK && is_dir (dirname, flags, pglob))
         {
-          char *p;
-          pglob->gl_pathv[newcount] = malloc (dirlen + 2);
-          if (pglob->gl_pathv[newcount] == NULL)
-            goto nospace;
-          p = mempcpy (pglob->gl_pathv[newcount], dirname, dirlen);
-          p[0] = '/';
-          p[1] = '\0';
-          if (__glibc_unlikely (malloc_dirname))
-            free (dirname);
-        }
-      else
-        {
-          if (__glibc_unlikely (malloc_dirname))
-            pglob->gl_pathv[newcount] = dirname;
-          else
+          if (dirlen + 1 == globbuf->length)
             {
-              pglob->gl_pathv[newcount] = strdup (dirname);
-              if (pglob->gl_pathv[newcount] == NULL)
+              if (!scratch_buffer_grow_preserve (globbuf))
                 goto nospace;
+              dirname = globbuf->data;
             }
+          dirname[dirlen++] = '/';
+          dirname[dirlen] = '\0';
         }
+      pglob->gl_pathv[newcount] = scratch_buffer_dupfree (globbuf, dirlen + 1);
+      if (pglob->gl_pathv[newcount] == NULL)
+        goto nospace;
+      scratch_buffer_init (globbuf);
+
       pglob->gl_pathv[++newcount] = NULL;
       ++pglob->gl_pathc;
       pglob->gl_flags = flags;
@@ -955,11 +853,11 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
         {
           /* "foo\\/bar".  Remove the final backslash from dirname
              if it has not been quoted.  */
-          char *p = (char *) &dirname[dirlen - 1];
+          char *p = &dirname[dirlen - 1];
 
           while (p > dirname && p[-1] == '\\') --p;
-          if ((&dirname[dirlen] - p) & 1)
-            *(char *) &dirname[--dirlen] = '\0';
+          dirlen -= (&dirname[dirlen] - p) & 1;
+          dirname[dirlen] = '\0';
         }
 
       if (__glibc_unlikely ((flags & GLOB_ALTDIRFUNC) != 0))
@@ -980,10 +878,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
       if (status != 0)
         {
           if ((flags & GLOB_NOCHECK) == 0 || status != GLOB_NOMATCH)
-            {
-              retval = status;
-              goto out;
-            }
+            return status;
           goto no_matches;
         }
 
@@ -1008,8 +903,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               globfree (&dirs);
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = status;
-              goto out;
+              return status;
             }
 
           /* Stick the directory on the front of each name.  */
@@ -1020,8 +914,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               globfree (&dirs);
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = GLOB_NOSPACE;
-              goto out;
+              return GLOB_NOSPACE;
             }
         }
 
@@ -1043,8 +936,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                 {
                 nospace2:
                   globfree (&dirs);
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  return GLOB_NOSPACE;
                 }
 
               new_gl_pathv = realloc (pglob->gl_pathv,
@@ -1059,8 +951,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
                   globfree (&dirs);
                   globfree (pglob);
                   pglob->gl_pathc = 0;
-                  retval = GLOB_NOSPACE;
-                  goto out;
+                  return GLOB_NOSPACE;
                 }
 
               ++pglob->gl_pathc;
@@ -1072,8 +963,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
           else
             {
               globfree (&dirs);
-              retval = GLOB_NOMATCH;
-              goto out;
+              return GLOB_NOMATCH;
             }
         }
 
@@ -1086,10 +976,8 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
       if (meta & GLOBPAT_BACKSLASH)
         {
+          /* Unescape DIRNAME.  */
           char *p = strchr (dirname, '\\'), *q;
-          /* We need to unescape the dirname string.  It is certainly
-             allocated by alloca, as otherwise filename would be NULL
-             or dirname wouldn't contain backslashes.  */
           q = p;
           do
             {
@@ -1103,7 +991,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               ++q;
             }
           while (*p++ != '\0');
-          dirname_modified = 1;
+          dirname_modified = true;
         }
       if (dirname_modified)
         flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
@@ -1119,8 +1007,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               flags = orig_flags;
               goto no_matches;
             }
-          retval = status;
-          goto out;
+          return status;
         }
 
       if (dirlen > 0)
@@ -1132,8 +1019,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               globfree (pglob);
               pglob->gl_pathc = 0;
-              retval = GLOB_NOSPACE;
-              goto out;
+              return GLOB_NOSPACE;
             }
         }
     }
@@ -1152,8 +1038,7 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
               {
                 globfree (pglob);
                 pglob->gl_pathc = 0;
-                retval = GLOB_NOSPACE;
-                goto out;
+                return GLOB_NOSPACE;
               }
             strcpy (&new[len - 2], "/");
             pglob->gl_pathv[i] = new;
@@ -1168,12 +1053,9 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int),
              sizeof (char *), collated_compare);
     }
 
- out:
-  if (__glibc_unlikely (malloc_dirname))
-    free (dirname);
-
-  return retval;
+  return 0;
 }
+
 #if defined _LIBC && !defined __glob
 versioned_symbol (libc, __glob, glob, GLIBC_2_27);
 libc_hidden_ver (__glob, glob)
-- 
2.27.0


[-- Attachment #3: 0002-glob-use-scratch_buffer-for-GLOB_BRACE.patch --]
[-- Type: text/x-patch, Size: 5491 bytes --]

From eaaf408c6dce7108a4ab47c9fad5009dda64bc04 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 12 Jan 2021 20:59:38 -0800
Subject: [PATCH 2/3] glob: use scratch_buffer for GLOB_BRACE

This is an alternative implementation of a patch for glibc
proposed by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2021-January/121345.html
* lib/glob.c (glob_buf): Calculate pattern length at start.
Use GLOBBUF for temporaries when analyzing brace expressions.
---
 ChangeLog  |  7 +++++++
 lib/glob.c | 40 +++++++++++++++++-----------------------
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index bf235ce49..e0a168e7b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,12 @@
 2021-01-12  Paul Eggert  <eggert@cs.ucla.edu>
 
+	glob: use scratch_buffer for GLOB_BRACE
+	This is an alternative implementation of a patch for glibc
+	proposed by Adhemerval Zanella in:
+	https://sourceware.org/pipermail/libc-alpha/2021-January/121345.html
+	* lib/glob.c (glob_buf): Calculate pattern length at start.
+	Use GLOBBUF for temporaries when analyzing brace expressions.
+
 	glob: use scratch_buffer for internal glob dirname
 	This is an alternative implementation of a patch for glibc
 	proposed by Adhemerval Zanella in:
diff --git a/lib/glob.c b/lib/glob.c
index 44d2fdd5f..304baebf6 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -342,7 +342,8 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
 
   /* POSIX requires all slashes to be matched.  This means that with
      a trailing slash we must match only directories.  */
-  if (pattern[0] && pattern[strlen (pattern) - 1] == '/')
+  size_t pattern_len = strlen (pattern);
+  if (pattern_len != 0 && pattern[pattern_len - 1] == '/')
     flags |= GLOB_ONLYDIR;
 
   if (!(flags & GLOB_DOOFFS))
@@ -409,16 +410,13 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
           const char *rest;
           size_t rest_len;
           char *onealt;
-          size_t pattern_len = strlen (pattern) - 1;
-          int alloca_onealt = glob_use_alloca (alloca_used, pattern_len);
-          if (alloca_onealt)
-            onealt = alloca_account (pattern_len, alloca_used);
-          else
-            {
-              onealt = malloc (pattern_len);
-              if (onealt == NULL)
-                return GLOB_NOSPACE;
-            }
+
+          /* Allocate room for the pattern with any sub-pattern
+             subtituted for the brace expression.  For simplicity, use
+             the pattern length; this is more than enough room.  */
+          if (!scratch_buffer_set_array_size (globbuf, pattern_len, 1))
+            return GLOB_NOSPACE;
+          onealt = globbuf->data;
 
           /* We know the prefix for all sub-patterns.  */
           alt_start = mempcpy (onealt, pattern, begin - pattern);
@@ -429,9 +427,6 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
           if (next == NULL)
             {
               /* It is an invalid expression.  */
-            illegal_brace:
-              if (__glibc_unlikely (!alloca_onealt))
-                free (onealt);
               flags &= ~GLOB_BRACE;
               goto no_brace;
             }
@@ -442,12 +437,16 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
             {
               rest = next_brace_sub (rest + 1, flags);
               if (rest == NULL)
-                /* It is an illegal expression.  */
-                goto illegal_brace;
+                {
+                  /* It is an invalid expression.  */
+                  flags &= ~GLOB_BRACE;
+                  goto no_brace;
+                }
             }
           /* Please note that we now can be sure the brace expression
              is well-formed.  */
-          rest_len = strlen (++rest) + 1;
+          rest_len = pattern + pattern_len - rest;
+          rest++;
 
           /* We have a brace expression.  BEGIN points to the opening {,
              NEXT points past the terminator of the first element, and END
@@ -472,8 +471,6 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
               /* If we got an error, return it.  */
               if (result && result != GLOB_NOMATCH)
                 {
-                  if (__glibc_unlikely (!alloca_onealt))
-                    free (onealt);
                   if (!(flags & GLOB_APPEND))
                     {
                       globfree (pglob);
@@ -491,9 +488,6 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
               assert (next != NULL);
             }
 
-          if (__glibc_unlikely (!alloca_onealt))
-            free (onealt);
-
           if (pglob->gl_pathc != firstc)
             /* We found some entries.  */
             return 0;
@@ -524,7 +518,7 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
          case is nothing but a notation for a directory.  */
       if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] == '~')
         {
-          dirlen = strlen (pattern);
+          dirlen = pattern_len;
           dirname = scratch_string (globbuf, pattern, dirlen);
           if (dirname == NULL)
             return GLOB_NOSPACE;
-- 
2.27.0


[-- Attachment #4: 0003-glob-use-scratch_buffer-for-glob_in_dir-GLOBPAT_NONE.patch --]
[-- Type: text/x-patch, Size: 5498 bytes --]

From 2a6bd9769d8eca95d63365bc8131b2ebc88294e6 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 12 Jan 2021 22:31:35 -0800
Subject: [PATCH 3/3] glob: use scratch_buffer for glob_in_dir GLOBPAT_NONE

This is an alternative implementation of a patch for glibc
proposed by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2021-January/121347.html
* lib/glob.c (glob_in_dir): Replace DIRECTORY arg with GLOBBUG +
DIRLEN args, to make it easy to append to it.  All callers
changed.
---
 ChangeLog  |  8 ++++++++
 lib/glob.c | 41 +++++++++++++++++------------------------
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e0a168e7b..2cab81b60 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,13 @@
 2021-01-12  Paul Eggert  <eggert@cs.ucla.edu>
 
+	glob: use scratch_buffer for glob_in_dir GLOBPAT_NONE
+	This is an alternative implementation of a patch for glibc
+	proposed by Adhemerval Zanella in:
+	https://sourceware.org/pipermail/libc-alpha/2021-January/121347.html
+	* lib/glob.c (glob_in_dir): Replace DIRECTORY arg with GLOBBUG +
+	DIRLEN args, to make it easy to append to it.  All callers
+	changed.
+
 	glob: use scratch_buffer for GLOB_BRACE
 	This is an alternative implementation of a patch for glibc
 	proposed by Adhemerval Zanella in:
diff --git a/lib/glob.c b/lib/glob.c
index 304baebf6..f3c229035 100644
--- a/lib/glob.c
+++ b/lib/glob.c
@@ -239,7 +239,8 @@ glob_use_alloca (size_t alloca_used, size_t len)
 
 static int glob_buf (char const *, int, int (*) (char const *, int), glob_t *,
                      struct scratch_buffer *);
-static int glob_in_dir (const char *pattern, const char *directory,
+static int glob_in_dir (const char *pattern,
+                        struct scratch_buffer *globbuf, size_t dirlen,
                         int flags, int (*errfunc) (const char *, int),
                         glob_t *pglob, size_t alloca_used);
 static int prefix_array (const char *prefix, char **array, size_t n) __THROWNL;
@@ -884,7 +885,9 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
           size_t old_pathc;
 
           old_pathc = pglob->gl_pathc;
-          status = glob_in_dir (filename, dirs.gl_pathv[i],
+          size_t dirs_i_len = strlen (dirs.gl_pathv[i]);
+          scratch_string (globbuf, dirs.gl_pathv[i], dirs_i_len);
+          status = glob_in_dir (filename, globbuf, dirs_i_len,
                                 ((flags | GLOB_APPEND)
                                  & ~(GLOB_NOCHECK | GLOB_NOMAGIC)),
                                 errfunc, pglob, alloca_used);
@@ -989,7 +992,7 @@ glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, int),
         }
       if (dirname_modified)
         flags &= ~(GLOB_NOCHECK | GLOB_NOMAGIC);
-      status = glob_in_dir (filename, dirname, flags, errfunc, pglob,
+      status = glob_in_dir (filename, globbuf, dirlen, flags, errfunc, pglob,
                             alloca_used);
       if (status != 0)
         {
@@ -1128,15 +1131,16 @@ prefix_array (const char *dirname, char **array, size_t n)
 }
 
 /* Like 'glob', but PATTERN is a final pathname component,
-   and matches are searched for in DIRECTORY.
+   and matches are searched for in the directory name in GLOBBUF.
    The GLOB_NOSORT bit in FLAGS is ignored.  No sorting is ever done.
    The GLOB_APPEND flag is assumed to be set (always appends).  */
 static int
-glob_in_dir (const char *pattern, const char *directory, int flags,
+glob_in_dir (const char *pattern, struct scratch_buffer *globbuf,
+             size_t dirlen, int flags,
              int (*errfunc) (const char *, int),
              glob_t *pglob, size_t alloca_used)
 {
-  size_t dirlen = strlen (directory);
+  char *directory = globbuf->data;
   void *stream = NULL;
 # define GLOBNAMES_MEMBERS(nnames) \
     struct globnames *next; size_t count; char *name[nnames];
@@ -1169,31 +1173,20 @@ glob_in_dir (const char *pattern, const char *directory, int flags,
   else if (meta == GLOBPAT_NONE)
     {
       size_t patlen = strlen (pattern);
-      size_t fullsize;
-      bool alloca_fullname
-        = (! size_add_wrapv (dirlen + 1, patlen + 1, &fullsize)
-           && glob_use_alloca (alloca_used, fullsize));
-      char *fullname;
-      if (alloca_fullname)
-        fullname = alloca_account (fullsize, alloca_used);
-      else
+      while (globbuf->length - dirlen < patlen + 2)
         {
-          fullname = malloc (fullsize);
-          if (fullname == NULL)
+          if (!scratch_buffer_grow_preserve (globbuf))
             return GLOB_NOSPACE;
+          directory = globbuf->data;
         }
-
-      mempcpy (mempcpy (mempcpy (fullname, directory, dirlen),
-                        "/", 1),
-               pattern, patlen + 1);
-      if (glob_lstat (pglob, flags, fullname) == 0
+      directory[dirlen] = '/';
+      strcpy (directory + dirlen + 1, pattern);
+      if (glob_lstat (pglob, flags, directory) == 0
           || errno == EOVERFLOW)
         /* We found this file to be existing.  Now tell the rest
            of the function to copy this name into the result.  */
         flags |= GLOB_NOCHECK;
-
-      if (__glibc_unlikely (!alloca_fullname))
-        free (fullname);
+      directory[dirlen] = '\0';
     }
   else
     {
-- 
2.27.0


      parent reply	other threads:[~2021-01-13 19:36 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-05 18:58 Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 1/8] malloc: Add specialized dynarray for C strings Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 2/8] posix: Use char_array for internal glob dirname Adhemerval Zanella
2021-03-23 16:08   ` Arjun Shankar
2021-03-24 17:39     ` Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 3/8] posix: Remove alloca usage for GLOB_BRACE on glob Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 4/8] posix: Remove alloca usage on glob dirname Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 5/8] posix: Use dynarray for globname in glob Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 6/8] posix: Remove alloca usage on glob user_name Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 7/8] posix: Use char_array for home_dir in glob Adhemerval Zanella
2021-01-05 18:58 ` [PATCH 8/8] posix: Remove all alloca usage " Adhemerval Zanella
2021-01-13 19:36 ` Paul Eggert [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8f12a10d-74ec-aaf6-7512-8021fabee24a@cs.ucla.edu \
    --to=eggert@cs.ucla.edu \
    --cc=adhemerval.zanella@linaro.org \
    --cc=bug-gnulib@gnu.org \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).