* fnmatch and invalid multibyte characters
From: Andreas Schwab @ 2002-11-28  6:07 UTC
  To: libc-hacker

When fnmatch detects an invalid multibyte character it should fall back to
single byte matching, so that "*" has a chance to match such a string.
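
The failure that triggers the fallback can be reproduced outside of
fnmatch.  The following is only a minimal sketch, not part of the patch:
converts_to_wide and use_utf8_locale are hypothetical helper names, and
the locale names "C.UTF-8" and "en_US.UTF-8" are assumptions about what is
installed on the system.  It performs the same length-only mbsrtowcs call
that fnmatch uses and reports whether the conversion failed.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try to select a UTF-8 LC_CTYPE locale.
   The locale names are assumptions; returns 1 on success, 0 otherwise.  */
static int
use_utf8_locale (void)
{
  return setlocale (LC_CTYPE, "C.UTF-8") != NULL
         || setlocale (LC_CTYPE, "en_US.UTF-8") != NULL;
}

/* Hypothetical helper: return 1 if S converts to wide characters in the
   current LC_CTYPE locale, 0 if mbsrtowcs reports an invalid sequence.
   With a NULL destination, mbsrtowcs only counts and leaves S alone.  */
static int
converts_to_wide (const char *s)
{
  mbstate_t ps;

  memset (&ps, '\0', sizeof (ps));
  return mbsrtowcs (NULL, &s, 0, &ps) != (size_t) -1;
}
```

If use_utf8_locale () succeeds, converts_to_wide ("\xff") should return 0,
since the byte 0xff never occurs in valid UTF-8.  That is the case where
fnmatch used to return -1 and where it would now retry byte by byte.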

Andreas.

2002-11-28  Andreas Schwab  <schwab@suse.de>

	* posix/fnmatch.c (fnmatch): If conversion to wide character
	fails, fall back to single byte matching.

--- posix/fnmatch.c.~1.47.~	2001-07-16 10:43:43.000000000 +0200
+++ posix/fnmatch.c	2002-11-25 15:21:12.000000000 +0100
@@ -325,6 +325,7 @@ fnmatch (pattern, string, flags)
 # if HANDLE_MULTIBYTE
   if (__builtin_expect (MB_CUR_MAX, 1) != 1)
     {
+      const char *orig_pattern = pattern;
       mbstate_t ps;
       size_t n;
       wchar_t *wpattern;
@@ -334,10 +335,8 @@ fnmatch (pattern, string, flags)
       memset (&ps, '\0', sizeof (ps));
       n = mbsrtowcs (NULL, &pattern, 0, &ps);
       if (__builtin_expect (n, 0) == (size_t) -1)
-	/* Something wrong.
-	   XXX Do we have to set `errno' to something which mbsrtows hasn't
-	   already done?  */
-	return -1;
+	/* Something wrong.  Fall back to single byte matching.  */
+	goto try_singlebyte;
       wpattern = (wchar_t *) alloca ((n + 1) * sizeof (wchar_t));
       assert (mbsinit (&ps));
       (void) mbsrtowcs (wpattern, &pattern, n + 1, &ps);
@@ -345,16 +344,17 @@ fnmatch (pattern, string, flags)
       assert (mbsinit (&ps));
       n = mbsrtowcs (NULL, &string, 0, &ps);
       if (__builtin_expect (n, 0) == (size_t) -1)
-	/* Something wrong.
-	   XXX Do we have to set `errno' to something which mbsrtows hasn't
-	   already done?  */
-	return -1;
+	/* Something wrong.  Fall back to single byte matching.  */
+	goto try_singlebyte;
       wstring = (wchar_t *) alloca ((n + 1) * sizeof (wchar_t));
       assert (mbsinit (&ps));
       (void) mbsrtowcs (wstring, &string, n + 1, &ps);
 
       return internal_fnwmatch (wpattern, wstring, wstring + n,
 				flags & FNM_PERIOD, flags);
+
+ try_singlebyte:
+      pattern = orig_pattern;
     }
 # endif  /* mbstate_t and mbsrtowcs or _LIBC.  */
 

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: fnmatch and invalid multibyte characters
From: Ulrich Drepper @ 2002-11-28 10:31 UTC
  To: Andreas Schwab; +Cc: libc-hacker

Andreas Schwab wrote:
> When fnmatch detects an invalid multibyte character it should fall back to
> single byte matching, so that "*" has a chance to match such a string.

And why is this better or more correct?  Is there existing practice?


-- 
--------------.                        ,-.            444 Castro Street
Ulrich Drepper \    ,-----------------'   \ Mountain View, CA 94041 USA
Red Hat         `--' drepper at redhat.com `---------------------------


* Re: fnmatch and invalid multibyte characters
From: Andreas Schwab @ 2002-11-29  1:58 UTC
  To: Ulrich Drepper; +Cc: libc-hacker

Ulrich Drepper <drepper@redhat.com> writes:

|> Andreas Schwab wrote:
|> > When fnmatch detects an invalid multibyte character it should fall back to
|> > single byte matching, so that "*" has a chance to match such a string.
|> 
|> And why is this better or more correct?

Just because the name contains some invalid multibyte characters does not
mean the file does not exist.
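
To illustrate from the caller's side: POSIX lets fnmatch return 0,
FNM_NOMATCH, or some other non-zero value on error, and a caller that
only tests for 0 treats an error like a non-match, so the name silently
drops out.  A minimal sketch follows; the enum and classify_match are
hypothetical names used only for this illustration.

```c
#include <fnmatch.h>

/* Hypothetical result type for this illustration.  */
enum match_result { MATCHED, NOT_MATCHED, MATCH_ERROR };

/* Hypothetical helper: separate "no match" from "fnmatch failed".
   POSIX specifies 0 for a match, FNM_NOMATCH for no match, and another
   non-zero value if an error occurs.  */
static enum match_result
classify_match (const char *pattern, const char *name)
{
  int rc = fnmatch (pattern, name, 0);

  if (rc == 0)
    return MATCHED;
  if (rc == FNM_NOMATCH)
    return NOT_MATCHED;
  return MATCH_ERROR;
}
```

Without the patch, a name with an invalid multibyte sequence would end up
in the MATCH_ERROR case in a multibyte locale, even for the pattern "*".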

|> Is there existing practice?

The version of glob distributed with bash 2.05b does the same.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

