From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from zimbra.cs.ucla.edu (zimbra.cs.ucla.edu [131.179.128.68]) by sourceware.org (Postfix) with ESMTPS id 8FC5B386F80B for ; Wed, 13 Jan 2021 19:36:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 8FC5B386F80B Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=cs.ucla.edu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=eggert@cs.ucla.edu Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 071A7160139; Wed, 13 Jan 2021 11:36:35 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 6QqRKdYN6kSG; Wed, 13 Jan 2021 11:36:31 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 9F4FE16013E; Wed, 13 Jan 2021 11:36:31 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id JefMmWvf_evF; Wed, 13 Jan 2021 11:36:31 -0800 (PST) Received: from [192.168.1.9] (cpe-23-243-218-95.socal.res.rr.com [23.243.218.95]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 5508A160139; Wed, 13 Jan 2021 11:36:31 -0800 (PST) To: Adhemerval Zanella References: <20210105185820.3796657-1-adhemerval.zanella@linaro.org> From: Paul Eggert Organization: UCLA Computer Science Department Cc: libc-alpha@sourceware.org, bug-gnulib@gnu.org Subject: Re: [PATCH 0/8] Remove alloca usage from glob Message-ID: <8f12a10d-74ec-aaf6-7512-8021fabee24a@cs.ucla.edu> Date: Wed, 13 Jan 2021 11:36:31 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0 MIME-Version: 1.0 In-Reply-To: <20210105185820.3796657-1-adhemerval.zanella@linaro.org> Content-Type: multipart/mixed; boundary="------------FB3EE2282CCB683C9ACD6144" Content-Language: en-US X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, NICE_REPLY_A, SPF_HELO_NONE, SPF_PASS, TXREP, T_FILL_THIS_FORM_SHORT autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Jan 2021 19:36:46 -0000 This is a multi-part message in MIME format. --------------FB3EE2282CCB683C9ACD6144 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable On 1/5/21 10:58 AM, Adhemerval Zanella wrote: > The idea of removing the alloca allows a slight better code generation, > simplifies the boilerplate code to avoid the unbounded alloca usage, > and it plays better with security compiler mitigation tools (such as th= e > ones for stack clash). Instead of complicating dynarray by adding char_array stuff, I suggest=20 adding any primitives just to glob for now, as it's not clear that=20 they're generally useful. I do see a problem with the proposed patch set, in that it creates=20 several different dynarrays when glob really needs only one or two. The=20 existing code is full of memory-allocation gotchas and would be=20 simplified if it treated the memory it allocates as a first-class part=20 of the problem rather than as some sort of cranky subsidiary that needs=20 to be babied and its diaper changed at random intervals. Attached is a draft set of patches against Gnulib commit=20 6a00fdb4bb105697aa27ba97ef7ec33287790ad3 which gives a hint about the=20 sort of thing that I mean here. This patch set is not complete (it does=20 only the equivalent of the first four of the patches you proposed) and=20 it's not exactly the form that I want, so I haven't installed it into=20 Gnulib. However, I hope it shows the sort of thing I have in mind. So=20 far, it's needed only one scratch buffer. In this patch set, glob.c continues to use scratch_buffer.h because=20 scratch buffers are good enough for all the changes needed so far. I=20 suppose dynarrays will be helpful for later patches and that we can=20 switch to them as needed, but I wanted to focus on the actual problem=20 first rather than worrying about scratch buffers vs dynarrays. --------------FB3EE2282CCB683C9ACD6144 Content-Type: text/x-patch; charset=UTF-8; name="0001-glob-use-scratch_buffer-for-internal-glob-dirname.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename*0="0001-glob-use-scratch_buffer-for-internal-glob-dirname.patch" =46rom 7bd5987cf0ed6668d34a921866c94b112594e714 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Tue, 12 Jan 2021 17:24:17 -0800 Subject: [PATCH 1/3] glob: use scratch_buffer for internal glob dirname This is an alternative implementation of a patch for glibc proposed by Adhemerval Zanella in: https://sourceware.org/pipermail/libc-alpha/2021-January/121344.html * lib/glob.c (scratch_string): New static function. (__glob): Just init and free a scratch buffer, and let glob_buf do the real work. This simplifies memory allocation cleanup. (glob_buf): New static function, like the old __glob but with a new struct scratch_buffer arg. Use the scratch buffer instead of alloca/malloc for dirname, drive_spec [__MSDOS__ || WINDOWS32 only]. home_dir, user_name. Use bool for internal booleans. --- ChangeLog | 14 ++ lib/glob.c | 432 +++++++++++++++++++---------------------------------- 2 files changed, 171 insertions(+), 275 deletions(-) diff --git a/ChangeLog b/ChangeLog index 1e589aac3..bf235ce49 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,17 @@ +2021-01-12 Paul Eggert + + glob: use scratch_buffer for internal glob dirname + This is an alternative implementation of a patch for glibc + proposed by Adhemerval Zanella in: + https://sourceware.org/pipermail/libc-alpha/2021-January/121344.html + * lib/glob.c (scratch_string): New static function. + (__glob): Just init and free a scratch buffer, and let glob_buf + do the real work. This simplifies memory allocation cleanup. + (glob_buf): New static function, like the old __glob but with + a new struct scratch_buffer arg. Use the scratch buffer instead + of alloca/malloc for dirname, drive_spec [__MSDOS__ || WINDOWS32 only].= + home_dir, user_name. Use bool for internal booleans. + 2021-01-08 Paul Eggert =20 dynarray: work even if =E2=80=98free=E2=80=99 is replaced diff --git a/lib/glob.c b/lib/glob.c index 32c88e5d1..44d2fdd5f 100644 --- a/lib/glob.c +++ b/lib/glob.c @@ -237,6 +237,8 @@ glob_use_alloca (size_t alloca_used, size_t len) && __libc_use_alloca (size)); } =20 +static int glob_buf (char const *, int, int (*) (char const *, int), glo= b_t *, + struct scratch_buffer *); static int glob_in_dir (const char *pattern, const char *directory, int flags, int (*errfunc) (const char *, int), glob_t *pglob, size_t alloca_used); @@ -244,6 +246,20 @@ static int prefix_array (const char *prefix, char **= array, size_t n) __THROWNL; static int collated_compare (const void *, const void *) __THROWNL; =20 =20 +/* Using GLOBBUF for storage, return a string with contents equal to + BUF with length BUFLEN. Terminate the string with a null byte. + Return NULL on memory allocation failure. */ +static char * +scratch_string (struct scratch_buffer *globbuf, char const *buf, size_t = buflen) +{ + if (!scratch_buffer_set_array_size (globbuf, buflen + 1, 1)) + return NULL; + char *copy =3D globbuf->data; + char *copy_end =3D mempcpy (copy, buf, buflen); + *copy_end =3D '\0'; + return copy; +} + /* Return true if FILENAME is a directory or a symbolic link to a direct= ory. Use FLAGS and PGLOB to resolve the filename. */ static bool @@ -290,23 +306,32 @@ next_brace_sub (const char *cp, int flags) it is called with the pathname that caused the error, and the 'errno' value from the failing call; if it returns non-zero 'glob' returns GLOB_ABORTED; if it returns zero, the error is ignored= =2E - If memory cannot be allocated for PGLOB, GLOB_NOSPACE is returned. + If memory cannot be allocated, return GLOB_NOSPACE. Otherwise, 'glob' returns zero. */ int GLOB_ATTRIBUTE __glob (const char *pattern, int flags, int (*errfunc) (const char *, in= t), glob_t *pglob) +{ + struct scratch_buffer globbuf; + scratch_buffer_init (&globbuf); + int result =3D glob_buf (pattern, flags, errfunc, pglob, &globbuf); + scratch_buffer_free (&globbuf); + return result; +} + +/* Like 'glob', but with an additional scratch buffer arg GLOBBUF. */ +static int +glob_buf (const char *pattern, int flags, int (*errfunc) (const char *, = int), + glob_t *pglob, struct scratch_buffer *globbuf) { const char *filename; - char *dirname =3D NULL; + char *dirname; size_t dirlen; int status; size_t oldcount; int meta; - int dirname_modified; - int malloc_dirname =3D 0; glob_t dirs; - int retval =3D 0; size_t alloca_used =3D 0; =20 if (pattern =3D=3D NULL || pglob =3D=3D NULL || (flags & ~__GLOB_FLAGS= ) !=3D 0) @@ -492,20 +517,21 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), filename =3D strchr (pattern, ':'); #endif /* __MSDOS__ || WINDOWS32 */ =20 - dirname_modified =3D 0; + bool dirname_modified =3D false; if (filename =3D=3D NULL) { /* This can mean two things: a simple name or "~name". The latter= case is nothing but a notation for a directory. */ if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] =3D=3D '= ~') { - dirname =3D (char *) pattern; dirlen =3D strlen (pattern); + dirname =3D scratch_string (globbuf, pattern, dirlen); + if (dirname =3D=3D NULL) + return GLOB_NOSPACE; =20 - /* Set FILENAME to NULL as a special flag. This is ugly but + /* Keep FILENAME =3D=3D NULL as a special flag. This is ugly = but other solutions would require much more code. We test for this special case below. */ - filename =3D NULL; } else { @@ -516,7 +542,7 @@ __glob (const char *pattern, int flags, int (*errfunc= ) (const char *, int), } =20 filename =3D pattern; - dirname =3D (char *) "."; + dirname =3D scratch_string (globbuf, ".", 1); dirlen =3D 0; } } @@ -525,47 +551,34 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), && (flags & GLOB_NOESCAPE) =3D=3D 0)) { /* "/pattern" or "\\/pattern". */ - dirname =3D (char *) "/"; + dirname =3D scratch_string (globbuf, "/", 1); dirlen =3D 1; ++filename; } else { - char *newp; dirlen =3D filename - pattern; + dirname =3D scratch_string (globbuf, pattern, dirlen); + if (dirname =3D=3D NULL) + return GLOB_NOSPACE; + ++filename; + #if defined __MSDOS__ || defined WINDOWS32 - if (*filename =3D=3D ':' - || (filename > pattern + 1 && filename[-1] =3D=3D ':')) + if (filename[-1] =3D=3D ':' || (2 < dirlen && filename[-2] =3D=3D = ':')) { - char *drive_spec; - ++dirlen; - drive_spec =3D __alloca (dirlen + 1); - *((char *) mempcpy (drive_spec, pattern, dirlen)) =3D '\0'; + dirname =3D scratch_string (globbuf, pattern, dirlen); + if (dirname =3D=3D NULL) + return GLOB_NOSPACE; /* For now, disallow wildcards in the drive spec, to prevent infinite recursion in glob. */ - if (__glob_pattern_p (drive_spec, !(flags & GLOB_NOESCAPE))) + if (__glob_pattern_p (dirname, !(flags & GLOB_NOESCAPE))) return GLOB_NOMATCH; /* If this is "d:pattern", we need to copy ':' to DIRNAME as well. If it's "d:/pattern", don't remove the slash from "d:/", since "d:" and "d:/" are not the same.*/ } -#endif =20 - if (glob_use_alloca (alloca_used, dirlen + 1)) - newp =3D alloca_account (dirlen + 1, alloca_used); - else - { - newp =3D malloc (dirlen + 1); - if (newp =3D=3D NULL) - return GLOB_NOSPACE; - malloc_dirname =3D 1; - } - *((char *) mempcpy (newp, pattern, dirlen)) =3D '\0'; - dirname =3D newp; - ++filename; - -#if defined __MSDOS__ || defined WINDOWS32 bool drive_root =3D (dirlen > 1 && (dirname[dirlen - 1] =3D=3D ':' || (dirlen > 2 && dirname[dirlen - 2] =3D=3D= ':' @@ -582,12 +595,12 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), { /* "pattern\\/". Remove the final backslash if it hasn't been quoted. */ - char *p =3D (char *) &dirname[dirlen - 1]; + char *p =3D &dirname[dirlen - 1]; =20 while (p > dirname && p[-1] =3D=3D '\\') --p; if ((&dirname[dirlen] - p) & 1) { - *(char *) &dirname[--dirlen] =3D '\0'; + dirname[--dirlen] =3D '\0'; flags &=3D ~(GLOB_NOCHECK | GLOB_NOMAGIC); } } @@ -603,8 +616,7 @@ __glob (const char *pattern, int flags, int (*errfunc= ) (const char *, int), oldcount =3D pglob->gl_pathc + pglob->gl_offs; goto no_matches; } - retval =3D val; - goto out; + return val; } } =20 @@ -615,8 +627,7 @@ __glob (const char *pattern, int flags, int (*errfunc= ) (const char *, int), && (dirname[2] =3D=3D '\0' || dirname[2] =3D=3D '/'))) { /* Look up home directory. */ - char *home_dir =3D getenv ("HOME"); - int malloc_home_dir =3D 0; + char const *home_dir =3D getenv ("HOME"); if (home_dir =3D=3D NULL || home_dir[0] =3D=3D '\0') { #ifdef WINDOWS32 @@ -629,8 +640,10 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), { size_t home_drive_len =3D strlen (home_drive); size_t home_path_len =3D strlen (home_path); - char *mem =3D alloca (home_drive_len + home_path_len += 1); - + size_t needed =3D home_drive_len + home_path_len + 1; + if (!scratch_buffer_set_array_size (globbuf, needed, 1= )) + return GLOB_NOSPACE; + char *mem =3D globbuf->data; memcpy (mem, home_drive, home_drive_len); memcpy (mem + home_drive_len, home_path, home_path_len= + 1); home_dir =3D mem; @@ -638,236 +651,128 @@ __glob (const char *pattern, int flags, int (*err= func) (const char *, int), else home_dir =3D "c:/users/default"; /* poor default */ #else - int err; - struct passwd *p; - struct passwd pwbuf; - struct scratch_buffer s; - scratch_buffer_init (&s); while (true) { - p =3D NULL; - err =3D __getlogin_r (s.data, s.length); + struct passwd *p =3D NULL; + char *sdata =3D globbuf->data; + size_t pwsize =3D globbuf->length / 2; + size_t loginsize =3D globbuf->length - pwsize; + int err =3D __getlogin_r (sdata, loginsize); if (err =3D=3D 0) { # if defined HAVE_GETPWNAM_R || defined _LIBC - size_t ssize =3D strlen (s.data) + 1; - char *sdata =3D s.data; - err =3D getpwnam_r (sdata, &pwbuf, sdata + ssize, - s.length - ssize, &p); + struct passwd pwbuf; + err =3D getpwnam_r (sdata, &pwbuf, + sdata + loginsize, pwsize, &p); # else - p =3D getpwnam (s.data); + errno =3D 0; + p =3D getpwnam (sdata); if (p =3D=3D NULL) err =3D errno; + else if (globbuf->length <=3D strlen (p->pw_dir)) + { + p =3D NULL; + err =3D ERANGE; + } # endif + if (p !=3D NULL) + home_dir =3D strcpy (sdata, p->pw_dir); } if (err !=3D ERANGE) break; - if (!scratch_buffer_grow (&s)) - { - retval =3D GLOB_NOSPACE; - goto out; - } - } - if (err =3D=3D 0) - { - home_dir =3D strdup (p->pw_dir); - malloc_home_dir =3D 1; - } - scratch_buffer_free (&s); - if (err =3D=3D 0 && home_dir =3D=3D NULL) - { - retval =3D GLOB_NOSPACE; - goto out; + if (!scratch_buffer_grow (globbuf)) + return GLOB_NOSPACE; } #endif /* WINDOWS32 */ } if (home_dir =3D=3D NULL || home_dir[0] =3D=3D '\0') { - if (__glibc_unlikely (malloc_home_dir)) - free (home_dir); if (flags & GLOB_TILDE_CHECK) - { - retval =3D GLOB_NOMATCH; - goto out; - } - else - { - home_dir =3D (char *) "~"; /* No luck. */ - malloc_home_dir =3D 0; - } - } - /* Now construct the full directory. */ - if (dirname[1] =3D=3D '\0') - { - if (__glibc_unlikely (malloc_dirname)) - free (dirname); + return GLOB_NOMATCH; =20 - dirname =3D home_dir; - dirlen =3D strlen (dirname); - malloc_dirname =3D malloc_home_dir; + home_dir =3D "~"; /* No luck. */ } - else - { - char *newp; - size_t home_len =3D strlen (home_dir); - int use_alloca =3D glob_use_alloca (alloca_used, home_len = + dirlen); - if (use_alloca) - newp =3D alloca_account (home_len + dirlen, alloca_used)= ; - else - { - newp =3D malloc (home_len + dirlen); - if (newp =3D=3D NULL) - { - if (__glibc_unlikely (malloc_home_dir)) - free (home_dir); - retval =3D GLOB_NOSPACE; - goto out; - } - } =20 - mempcpy (mempcpy (newp, home_dir, home_len), - &dirname[1], dirlen); - - if (__glibc_unlikely (malloc_dirname)) - free (dirname); - - dirname =3D newp; - dirlen +=3D home_len - 1; - malloc_dirname =3D !use_alloca; - - if (__glibc_unlikely (malloc_home_dir)) - free (home_dir); - } - dirname_modified =3D 1; + /* Now construct the full directory. */ + size_t home_len =3D strlen (home_dir); + if (home_dir =3D=3D globbuf->data) + home_dir =3D NULL; + while (globbuf->length < home_len + dirlen) + if (!scratch_buffer_grow_preserve (globbuf)) + return GLOB_NOSPACE; + dirname =3D globbuf->data; + if (home_dir !=3D NULL) + memcpy (dirname, home_dir, home_len); + char *dirend =3D mempcpy (dirname + home_len, &pattern[1], dir= len - 1); + *dirend =3D '\0'; + dirlen +=3D home_len - 1; + dirname_modified =3D true; } else { #ifndef WINDOWS32 char *end_name =3D strchr (dirname, '/'); - char *user_name; - int malloc_user_name =3D 0; - char *unescape =3D NULL; - - if (!(flags & GLOB_NOESCAPE)) - { - if (end_name =3D=3D NULL) - { - unescape =3D strchr (dirname, '\\'); - if (unescape) - end_name =3D strchr (unescape, '\0'); - } - else - unescape =3D memchr (dirname, '\\', end_name - dirname);= - } if (end_name =3D=3D NULL) - user_name =3D dirname + 1; - else + end_name =3D dirname + dirlen; + size_t end_offset =3D end_name - dirname; + size_t user_size =3D 0; + for (size_t i =3D 1; i < end_offset; + dirname[user_size++] =3D dirname[i++]) { - char *newp; - if (glob_use_alloca (alloca_used, end_name - dirname)) - newp =3D alloca_account (end_name - dirname, alloca_used= ); - else + if (! (flags & GLOB_NOESCAPE) && dirname[i] =3D=3D '\\') { - newp =3D malloc (end_name - dirname); - if (newp =3D=3D NULL) + i++; + if (i =3D=3D end_offset) { - retval =3D GLOB_NOSPACE; - goto out; + /* "~fo\\o\\" unescapes to user name "foo\\", + but "~fo\\o\\/" unescapes to user name "foo". = */ + if (filename =3D=3D NULL) + dirname[user_size++] =3D '\\'; + break; } - malloc_user_name =3D 1; } - if (unescape !=3D NULL) - { - char *p =3D mempcpy (newp, dirname + 1, - unescape - dirname - 1); - char *q =3D unescape; - while (q !=3D end_name) - { - if (*q =3D=3D '\\') - { - if (q + 1 =3D=3D end_name) - { - /* "~fo\\o\\" unescape to user_name "foo\\= ", - but "~fo\\o\\/" unescape to user_name - "foo". */ - if (filename =3D=3D NULL) - *p++ =3D '\\'; - break; - } - ++q; - } - *p++ =3D *q++; - } - *p =3D '\0'; - } - else - *((char *) mempcpy (newp, dirname + 1, end_name - dirnam= e - 1)) - =3D '\0'; - user_name =3D newp; } + dirname[user_size++] =3D '\0'; =20 /* Look up specific user's home directory. */ { struct passwd *p; - struct scratch_buffer pwtmpbuf; - scratch_buffer_init (&pwtmpbuf); + ptrdiff_t home_offset =3D -1; =20 # if defined HAVE_GETPWNAM_R || defined _LIBC struct passwd pwbuf; =20 - while (getpwnam_r (user_name, &pwbuf, - pwtmpbuf.data, pwtmpbuf.length, &p) + while (getpwnam_r (dirname, &pwbuf, dirname + user_size, + globbuf->length - user_size, &p) =3D=3D ERANGE) { - if (!scratch_buffer_grow (&pwtmpbuf)) - { - retval =3D GLOB_NOSPACE; - goto out; - } + if (!scratch_buffer_grow_preserve (globbuf)) + return GLOB_NOSPACE; + dirname =3D globbuf->data; } + if (p !=3D NULL) + home_offset =3D p->pw_dir - dirname; # else - p =3D getpwnam (user_name); + p =3D getpwnam (dirname); # endif =20 - if (__glibc_unlikely (malloc_user_name)) - free (user_name); - - /* If we found a home directory use this. */ if (p !=3D NULL) { + /* Use the home directory that we found. */ size_t home_len =3D strlen (p->pw_dir); - size_t rest_len =3D end_name =3D=3D NULL ? 0 : strlen (e= nd_name); - /* dirname contains end_name; we can't free it now. */ - char *prev_dirname =3D - (__glibc_unlikely (malloc_dirname) ? dirname : NULL); - char *d; - - malloc_dirname =3D 0; - - if (glob_use_alloca (alloca_used, home_len + rest_len + = 1)) - dirname =3D alloca_account (home_len + rest_len + 1, - alloca_used); - else - { - dirname =3D malloc (home_len + rest_len + 1); - if (dirname =3D=3D NULL) - { - free (prev_dirname); - scratch_buffer_free (&pwtmpbuf); - retval =3D GLOB_NOSPACE; - goto out; - } - malloc_dirname =3D 1; - } - d =3D mempcpy (dirname, p->pw_dir, home_len); - if (end_name !=3D NULL) - d =3D mempcpy (d, end_name, rest_len); - *d =3D '\0'; - - free (prev_dirname); - + size_t rest_len =3D dirlen - end_offset; dirlen =3D home_len + rest_len; - dirname_modified =3D 1; + while (globbuf->length <=3D dirlen) + if (!scratch_buffer_grow_preserve (globbuf)) + return GLOB_NOSPACE; + dirname =3D globbuf->data; + char *home =3D (home_offset < 0 ? p->pw_dir + : dirname + home_offset); + memmove (dirname, home, home_len); + char *dirend =3D mempcpy (dirname + home_len, + pattern + end_offset, rest_len);= + *dirend =3D '\0'; + dirname_modified =3D true; } else { @@ -875,11 +780,12 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), { /* We have to regard it as an error if we cannot fin= d the home directory. */ - retval =3D GLOB_NOMATCH; - goto out; + return GLOB_NOMATCH; } + + /* Restore the original tilde pattern. */ + dirname =3D scratch_string (globbuf, pattern, dirlen); } - scratch_buffer_free (&pwtmpbuf); } #endif /* !WINDOWS32 */ } @@ -898,8 +804,7 @@ __glob (const char *pattern, int flags, int (*errfunc= ) (const char *, int), free (pglob->gl_pathv); pglob->gl_pathv =3D NULL; pglob->gl_pathc =3D 0; - retval =3D GLOB_NOSPACE; - goto out; + return GLOB_NOSPACE; } =20 new_gl_pathv =3D realloc (pglob->gl_pathv, @@ -910,27 +815,20 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), =20 if (flags & GLOB_MARK && is_dir (dirname, flags, pglob)) { - char *p; - pglob->gl_pathv[newcount] =3D malloc (dirlen + 2); - if (pglob->gl_pathv[newcount] =3D=3D NULL) - goto nospace; - p =3D mempcpy (pglob->gl_pathv[newcount], dirname, dirlen); - p[0] =3D '/'; - p[1] =3D '\0'; - if (__glibc_unlikely (malloc_dirname)) - free (dirname); - } - else - { - if (__glibc_unlikely (malloc_dirname)) - pglob->gl_pathv[newcount] =3D dirname; - else + if (dirlen + 1 =3D=3D globbuf->length) { - pglob->gl_pathv[newcount] =3D strdup (dirname); - if (pglob->gl_pathv[newcount] =3D=3D NULL) + if (!scratch_buffer_grow_preserve (globbuf)) goto nospace; + dirname =3D globbuf->data; } + dirname[dirlen++] =3D '/'; + dirname[dirlen] =3D '\0'; } + pglob->gl_pathv[newcount] =3D scratch_buffer_dupfree (globbuf, dir= len + 1); + if (pglob->gl_pathv[newcount] =3D=3D NULL) + goto nospace; + scratch_buffer_init (globbuf); + pglob->gl_pathv[++newcount] =3D NULL; ++pglob->gl_pathc; pglob->gl_flags =3D flags; @@ -955,11 +853,11 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), { /* "foo\\/bar". Remove the final backslash from dirname if it has not been quoted. */ - char *p =3D (char *) &dirname[dirlen - 1]; + char *p =3D &dirname[dirlen - 1]; =20 while (p > dirname && p[-1] =3D=3D '\\') --p; - if ((&dirname[dirlen] - p) & 1) - *(char *) &dirname[--dirlen] =3D '\0'; + dirlen -=3D (&dirname[dirlen] - p) & 1; + dirname[dirlen] =3D '\0'; } =20 if (__glibc_unlikely ((flags & GLOB_ALTDIRFUNC) !=3D 0)) @@ -980,10 +878,7 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), if (status !=3D 0) { if ((flags & GLOB_NOCHECK) =3D=3D 0 || status !=3D GLOB_NOMATC= H) - { - retval =3D status; - goto out; - } + return status; goto no_matches; } =20 @@ -1008,8 +903,7 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), globfree (&dirs); globfree (pglob); pglob->gl_pathc =3D 0; - retval =3D status; - goto out; + return status; } =20 /* Stick the directory on the front of each name. */ @@ -1020,8 +914,7 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), globfree (&dirs); globfree (pglob); pglob->gl_pathc =3D 0; - retval =3D GLOB_NOSPACE; - goto out; + return GLOB_NOSPACE; } } =20 @@ -1043,8 +936,7 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), { nospace2: globfree (&dirs); - retval =3D GLOB_NOSPACE; - goto out; + return GLOB_NOSPACE; } =20 new_gl_pathv =3D realloc (pglob->gl_pathv, @@ -1059,8 +951,7 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), globfree (&dirs); globfree (pglob); pglob->gl_pathc =3D 0; - retval =3D GLOB_NOSPACE; - goto out; + return GLOB_NOSPACE; } =20 ++pglob->gl_pathc; @@ -1072,8 +963,7 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), else { globfree (&dirs); - retval =3D GLOB_NOMATCH; - goto out; + return GLOB_NOMATCH; } } =20 @@ -1086,10 +976,8 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), =20 if (meta & GLOBPAT_BACKSLASH) { + /* Unescape DIRNAME. */ char *p =3D strchr (dirname, '\\'), *q; - /* We need to unescape the dirname string. It is certainly - allocated by alloca, as otherwise filename would be NULL - or dirname wouldn't contain backslashes. */ q =3D p; do { @@ -1103,7 +991,7 @@ __glob (const char *pattern, int flags, int (*errfun= c) (const char *, int), ++q; } while (*p++ !=3D '\0'); - dirname_modified =3D 1; + dirname_modified =3D true; } if (dirname_modified) flags &=3D ~(GLOB_NOCHECK | GLOB_NOMAGIC); @@ -1119,8 +1007,7 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), flags =3D orig_flags; goto no_matches; } - retval =3D status; - goto out; + return status; } =20 if (dirlen > 0) @@ -1132,8 +1019,7 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), { globfree (pglob); pglob->gl_pathc =3D 0; - retval =3D GLOB_NOSPACE; - goto out; + return GLOB_NOSPACE; } } } @@ -1152,8 +1038,7 @@ __glob (const char *pattern, int flags, int (*errfu= nc) (const char *, int), { globfree (pglob); pglob->gl_pathc =3D 0; - retval =3D GLOB_NOSPACE; - goto out; + return GLOB_NOSPACE; } strcpy (&new[len - 2], "/"); pglob->gl_pathv[i] =3D new; @@ -1168,12 +1053,9 @@ __glob (const char *pattern, int flags, int (*errf= unc) (const char *, int), sizeof (char *), collated_compare); } =20 - out: - if (__glibc_unlikely (malloc_dirname)) - free (dirname); - - return retval; + return 0; } + #if defined _LIBC && !defined __glob versioned_symbol (libc, __glob, glob, GLIBC_2_27); libc_hidden_ver (__glob, glob) --=20 2.27.0 --------------FB3EE2282CCB683C9ACD6144 Content-Type: text/x-patch; charset=UTF-8; name="0002-glob-use-scratch_buffer-for-GLOB_BRACE.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="0002-glob-use-scratch_buffer-for-GLOB_BRACE.patch" =46rom eaaf408c6dce7108a4ab47c9fad5009dda64bc04 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Tue, 12 Jan 2021 20:59:38 -0800 Subject: [PATCH 2/3] glob: use scratch_buffer for GLOB_BRACE This is an alternative implementation of a patch for glibc proposed by Adhemerval Zanella in: https://sourceware.org/pipermail/libc-alpha/2021-January/121345.html * lib/glob.c (glob_buf): Calculate pattern length at start. Use GLOBBUF for temporaries when analyzing brace expressions. --- ChangeLog | 7 +++++++ lib/glob.c | 40 +++++++++++++++++----------------------- 2 files changed, 24 insertions(+), 23 deletions(-) diff --git a/ChangeLog b/ChangeLog index bf235ce49..e0a168e7b 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,12 @@ 2021-01-12 Paul Eggert =20 + glob: use scratch_buffer for GLOB_BRACE + This is an alternative implementation of a patch for glibc + proposed by Adhemerval Zanella in: + https://sourceware.org/pipermail/libc-alpha/2021-January/121345.html + * lib/glob.c (glob_buf): Calculate pattern length at start. + Use GLOBBUF for temporaries when analyzing brace expressions. + glob: use scratch_buffer for internal glob dirname This is an alternative implementation of a patch for glibc proposed by Adhemerval Zanella in: diff --git a/lib/glob.c b/lib/glob.c index 44d2fdd5f..304baebf6 100644 --- a/lib/glob.c +++ b/lib/glob.c @@ -342,7 +342,8 @@ glob_buf (const char *pattern, int flags, int (*errfu= nc) (const char *, int), =20 /* POSIX requires all slashes to be matched. This means that with a trailing slash we must match only directories. */ - if (pattern[0] && pattern[strlen (pattern) - 1] =3D=3D '/') + size_t pattern_len =3D strlen (pattern); + if (pattern_len !=3D 0 && pattern[pattern_len - 1] =3D=3D '/') flags |=3D GLOB_ONLYDIR; =20 if (!(flags & GLOB_DOOFFS)) @@ -409,16 +410,13 @@ glob_buf (const char *pattern, int flags, int (*err= func) (const char *, int), const char *rest; size_t rest_len; char *onealt; - size_t pattern_len =3D strlen (pattern) - 1; - int alloca_onealt =3D glob_use_alloca (alloca_used, pattern_le= n); - if (alloca_onealt) - onealt =3D alloca_account (pattern_len, alloca_used); - else - { - onealt =3D malloc (pattern_len); - if (onealt =3D=3D NULL) - return GLOB_NOSPACE; - } + + /* Allocate room for the pattern with any sub-pattern + subtituted for the brace expression. For simplicity, use + the pattern length; this is more than enough room. */ + if (!scratch_buffer_set_array_size (globbuf, pattern_len, 1)) + return GLOB_NOSPACE; + onealt =3D globbuf->data; =20 /* We know the prefix for all sub-patterns. */ alt_start =3D mempcpy (onealt, pattern, begin - pattern); @@ -429,9 +427,6 @@ glob_buf (const char *pattern, int flags, int (*errfu= nc) (const char *, int), if (next =3D=3D NULL) { /* It is an invalid expression. */ - illegal_brace: - if (__glibc_unlikely (!alloca_onealt)) - free (onealt); flags &=3D ~GLOB_BRACE; goto no_brace; } @@ -442,12 +437,16 @@ glob_buf (const char *pattern, int flags, int (*err= func) (const char *, int), { rest =3D next_brace_sub (rest + 1, flags); if (rest =3D=3D NULL) - /* It is an illegal expression. */ - goto illegal_brace; + { + /* It is an invalid expression. */ + flags &=3D ~GLOB_BRACE; + goto no_brace; + } } /* Please note that we now can be sure the brace expression is well-formed. */ - rest_len =3D strlen (++rest) + 1; + rest_len =3D pattern + pattern_len - rest; + rest++; =20 /* We have a brace expression. BEGIN points to the opening {,= NEXT points past the terminator of the first element, and E= ND @@ -472,8 +471,6 @@ glob_buf (const char *pattern, int flags, int (*errfu= nc) (const char *, int), /* If we got an error, return it. */ if (result && result !=3D GLOB_NOMATCH) { - if (__glibc_unlikely (!alloca_onealt)) - free (onealt); if (!(flags & GLOB_APPEND)) { globfree (pglob); @@ -491,9 +488,6 @@ glob_buf (const char *pattern, int flags, int (*errfu= nc) (const char *, int), assert (next !=3D NULL); } =20 - if (__glibc_unlikely (!alloca_onealt)) - free (onealt); - if (pglob->gl_pathc !=3D firstc) /* We found some entries. */ return 0; @@ -524,7 +518,7 @@ glob_buf (const char *pattern, int flags, int (*errfu= nc) (const char *, int), case is nothing but a notation for a directory. */ if ((flags & (GLOB_TILDE|GLOB_TILDE_CHECK)) && pattern[0] =3D=3D '= ~') { - dirlen =3D strlen (pattern); + dirlen =3D pattern_len; dirname =3D scratch_string (globbuf, pattern, dirlen); if (dirname =3D=3D NULL) return GLOB_NOSPACE; --=20 2.27.0 --------------FB3EE2282CCB683C9ACD6144 Content-Type: text/x-patch; charset=UTF-8; name="0003-glob-use-scratch_buffer-for-glob_in_dir-GLOBPAT_NONE.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename*0="0003-glob-use-scratch_buffer-for-glob_in_dir-GLOBPAT_NONE.pa"; filename*1="tch" =46rom 2a6bd9769d8eca95d63365bc8131b2ebc88294e6 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Tue, 12 Jan 2021 22:31:35 -0800 Subject: [PATCH 3/3] glob: use scratch_buffer for glob_in_dir GLOBPAT_NON= E This is an alternative implementation of a patch for glibc proposed by Adhemerval Zanella in: https://sourceware.org/pipermail/libc-alpha/2021-January/121347.html * lib/glob.c (glob_in_dir): Replace DIRECTORY arg with GLOBBUG + DIRLEN args, to make it easy to append to it. All callers changed. --- ChangeLog | 8 ++++++++ lib/glob.c | 41 +++++++++++++++++------------------------ 2 files changed, 25 insertions(+), 24 deletions(-) diff --git a/ChangeLog b/ChangeLog index e0a168e7b..2cab81b60 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,5 +1,13 @@ 2021-01-12 Paul Eggert =20 + glob: use scratch_buffer for glob_in_dir GLOBPAT_NONE + This is an alternative implementation of a patch for glibc + proposed by Adhemerval Zanella in: + https://sourceware.org/pipermail/libc-alpha/2021-January/121347.html + * lib/glob.c (glob_in_dir): Replace DIRECTORY arg with GLOBBUG + + DIRLEN args, to make it easy to append to it. All callers + changed. + glob: use scratch_buffer for GLOB_BRACE This is an alternative implementation of a patch for glibc proposed by Adhemerval Zanella in: diff --git a/lib/glob.c b/lib/glob.c index 304baebf6..f3c229035 100644 --- a/lib/glob.c +++ b/lib/glob.c @@ -239,7 +239,8 @@ glob_use_alloca (size_t alloca_used, size_t len) =20 static int glob_buf (char const *, int, int (*) (char const *, int), glo= b_t *, struct scratch_buffer *); -static int glob_in_dir (const char *pattern, const char *directory, +static int glob_in_dir (const char *pattern, + struct scratch_buffer *globbuf, size_t dirlen, int flags, int (*errfunc) (const char *, int), glob_t *pglob, size_t alloca_used); static int prefix_array (const char *prefix, char **array, size_t n) __T= HROWNL; @@ -884,7 +885,9 @@ glob_buf (const char *pattern, int flags, int (*errfu= nc) (const char *, int), size_t old_pathc; =20 old_pathc =3D pglob->gl_pathc; - status =3D glob_in_dir (filename, dirs.gl_pathv[i], + size_t dirs_i_len =3D strlen (dirs.gl_pathv[i]); + scratch_string (globbuf, dirs.gl_pathv[i], dirs_i_len); + status =3D glob_in_dir (filename, globbuf, dirs_i_len, ((flags | GLOB_APPEND) & ~(GLOB_NOCHECK | GLOB_NOMAGIC)), errfunc, pglob, alloca_used); @@ -989,7 +992,7 @@ glob_buf (const char *pattern, int flags, int (*errfu= nc) (const char *, int), } if (dirname_modified) flags &=3D ~(GLOB_NOCHECK | GLOB_NOMAGIC); - status =3D glob_in_dir (filename, dirname, flags, errfunc, pglob, + status =3D glob_in_dir (filename, globbuf, dirlen, flags, errfunc,= pglob, alloca_used); if (status !=3D 0) { @@ -1128,15 +1131,16 @@ prefix_array (const char *dirname, char **array, = size_t n) } =20 /* Like 'glob', but PATTERN is a final pathname component, - and matches are searched for in DIRECTORY. + and matches are searched for in the directory name in GLOBBUF. The GLOB_NOSORT bit in FLAGS is ignored. No sorting is ever done. The GLOB_APPEND flag is assumed to be set (always appends). */ static int -glob_in_dir (const char *pattern, const char *directory, int flags, +glob_in_dir (const char *pattern, struct scratch_buffer *globbuf, + size_t dirlen, int flags, int (*errfunc) (const char *, int), glob_t *pglob, size_t alloca_used) { - size_t dirlen =3D strlen (directory); + char *directory =3D globbuf->data; void *stream =3D NULL; # define GLOBNAMES_MEMBERS(nnames) \ struct globnames *next; size_t count; char *name[nnames]; @@ -1169,31 +1173,20 @@ glob_in_dir (const char *pattern, const char *dir= ectory, int flags, else if (meta =3D=3D GLOBPAT_NONE) { size_t patlen =3D strlen (pattern); - size_t fullsize; - bool alloca_fullname - =3D (! size_add_wrapv (dirlen + 1, patlen + 1, &fullsize) - && glob_use_alloca (alloca_used, fullsize)); - char *fullname; - if (alloca_fullname) - fullname =3D alloca_account (fullsize, alloca_used); - else + while (globbuf->length - dirlen < patlen + 2) { - fullname =3D malloc (fullsize); - if (fullname =3D=3D NULL) + if (!scratch_buffer_grow_preserve (globbuf)) return GLOB_NOSPACE; + directory =3D globbuf->data; } - - mempcpy (mempcpy (mempcpy (fullname, directory, dirlen), - "/", 1), - pattern, patlen + 1); - if (glob_lstat (pglob, flags, fullname) =3D=3D 0 + directory[dirlen] =3D '/'; + strcpy (directory + dirlen + 1, pattern); + if (glob_lstat (pglob, flags, directory) =3D=3D 0 || errno =3D=3D EOVERFLOW) /* We found this file to be existing. Now tell the rest of the function to copy this name into the result. */ flags |=3D GLOB_NOCHECK; - - if (__glibc_unlikely (!alloca_fullname)) - free (fullname); + directory[dirlen] =3D '\0'; } else { --=20 2.27.0 --------------FB3EE2282CCB683C9ACD6144--