* grep rebuild?
From: Corinna Vinschen @ 2023-03-16 12:08 UTC
To: Brian Inglis, cygwin-apps

Hi Brian,

there's a problem with the grep package. It uses the internally
provided GNULIB regex library.

Unfortunately, that's the default if the system doesn't provide a more
recent GLibc. Which we'll never do. The problem is this: Native
language support in GNULIB's regex is *only* available if it's built as
part of GLibc.

I'd like to ask you to rebuild grep 3.9 with the
--without-included-regex option.

That will allow grep to use Cygwin's own regex, which already comes with
basic native language support, and which I'm working on to better
support equivalence class and collation symbol expressions.


Thanks,
Corinna

* Re: grep rebuild?
From: Brian Inglis @ 2023-03-16 16:50 UTC
To: cygwin-apps

On 2023-03-16 06:08, Corinna Vinschen via Cygwin-apps wrote:
> Hi Brian,
>
> there's a problem with the grep package. It uses the internally
> provided GNULIB regex library.
>
> Unfortunately, that's the default if the system doesn't provide a more
> recent GLibc. Which we'll never do. The problem is this: Native
> language support in GNULIB's regex is *only* available, if it's built as
> part of GLibc.
>
> I'd like to ask you to rebuild grep 3.9 with the
> --without-included-regex option.
>
> That will allow grep to use Cygwin's own regex, which already comes with
> basic native language support, and which I'm working on to better
> support equivalence class and collation symbol expressions.

Hi Corinna,

We discussed this and I was going to release grep 3.8 test release 3, for
testing with snapshots or when Cygwin 3.5.0 is released. Then grep 3.9 came
out, and I realized grep is updated every few months, so that went on the
back burner. I can do a test release for 3.9-2 with that configuration
change.

The current release passes all the class tests and works for me and Andrey.
Are there any other implications of language support affecting grep?

--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry

* Re: grep rebuild?
From: Corinna Vinschen @ 2023-03-16 19:31 UTC
To: Brian Inglis
Cc: cygwin-apps

On Mar 16 10:50, Brian Inglis via Cygwin-apps wrote:
> On 2023-03-16 06:08, Corinna Vinschen via Cygwin-apps wrote:
> > Hi Brian,
> >
> > there's a problem with the grep package. It uses the internally
> > provided GNULIB regex library.
> >
> > Unfortunately, that's the default if the system doesn't provide a more
> > recent GLibc. Which we'll never do. The problem is this: Native
> > language support in GNULIB's regex is *only* available, if it's built as
> > part of GLibc.
> >
> > I'd like to ask you to rebuild grep 3.9 with the
> > --without-included-regex option.
> >
> > That will allow grep to use Cygwin's own regex, which already comes with
> > basic native language support, and which I'm working on to better
> > support equivalence class and collation symbol expressions.
>
> Hi Corinna,
>
> We discussed this and I was going to release grep 3.8 test release 3, for
> testing with snapshots or when Cygwin 3.5.0 is released, then grep 3.9 came
> out, and I realized grep is updated every few months, so that went on the
> back burner. I can do a test release for 3.9-2 with that configuration
> change.
>
> The current release passes all the class tests and works for me and Andrey.
> Are there any other implications of language support affecting grep?

As I wrote above, equivalence class and collation symbol expressions.
Character classes are easy and basically always supported, they don't
really count.

Here's what I expect to work:

First an example with an equivalence class. "./fnmatch" is a simple
application calling fnmatch, with the 1st arg being the glob expression
and the 2nd arg being the search expression. Locale is simply en_US.utf8.
Note the accented uppercase À!

  $ ./fnmatch '[[=a=]]' 'a'
  fnmatch ([[=a=]], a, 0) = 0 (en_US.utf8)
  $ ./fnmatch '[[=a=]]' 'b'
  fnmatch ([[=a=]], b, 0) = 1 (en_US.utf8)
  $ ./fnmatch '[[=a=]]' 'À'
  fnmatch ([[=a=]], À, 0) = 0 (en_US.utf8)
  $ ./fnmatch '[[=À=]]' 'a'
  fnmatch ([[=À=]], a, 0) = 0 (en_US.utf8)

As you can see, the non-accented a and the accented À belong to the same
equivalence class.

Now let's try grep on Cygwin:

  $ echo 'a' | LC_COLLATE=en_US.utf8 grep '[[=a=]]'
  a
  $ echo 'b' | LC_COLLATE=en_US.utf8 grep '[[=a=]]'
  $ echo 'À' | LC_COLLATE=en_US.utf8 grep '[[=a=]]'
  $ echo 'a' | LC_COLLATE=en_US.utf8 grep '[[=À=]]'
  grep: Invalid collation character

The first two results are expected, but not the third and fourth result.
Let's try the same on Linux:

  $ echo 'a' | LC_COLLATE=en_US.utf8 grep '[[=a=]]'
  a
  $ echo 'b' | LC_COLLATE=en_US.utf8 grep '[[=a=]]'
  $ echo 'À' | LC_COLLATE=en_US.utf8 grep '[[=a=]]'
  À
  $ echo 'a' | LC_COLLATE=en_US.utf8 grep '[[=À=]]'
  a

See the difference?

Next, let's try a collating element:

"./glob" is a simple test app calling glob and setting the locale to
the second argument. There's a file called "chakref" in the CWD.

There's no collating element "ch" in English:

  $ ./glob '[[.ch.]]*' en_US.utf8
  glob ([[.ch.]]*) = -3

But in Czech:

  $ ./glob '[[.ch.]]*' cs_CZ.utf8
  chakref

Try this with current grep:

  $ ls -1 | LC_COLLATE=en_US.utf8 grep '^[[.ch.]].*'
  grep: Invalid collation character

Ok.

  $ ls -1 | LC_COLLATE=cs_CZ.utf8 grep '^[[.ch.]].*'
  grep: Invalid collation character

Not ok.

On Linux:

  $ ls -1 | LC_COLLATE=en_US.utf8 grep '^[[.ch.]].*'
  grep: Invalid collation character

Ok.

  $ ls -1 | LC_COLLATE=cs_CZ.utf8 grep '^[[.ch.]].*'
  chakref

Ok.

Please note that, right now, collating symbols and equivalence classes
*only* work in the Cygwin main branch in glob(3) and fnmatch(3), but NOT
YET in regex(3). That's what I'm planning to add in the next couple of
weeks (or months...)


Thanks,
Corinna

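For readers who want to repeat the fnmatch test above, here is a minimal sketch of what such a "./fnmatch" tool could look like. It is not the program used in this thread; the helper name glob_matches and the FNMATCH_DEMO_MAIN macro are made up for illustration. Only fnmatch(3) and setlocale(3) are real interfaces here, and whether '[[=a=]]' matches 'À' depends entirely on the C library and the locale data it can load.

```c
#include <assert.h>
#include <fnmatch.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread):
   1 = match, 0 = no match, -1 = fnmatch() reported an error. */
int glob_matches(const char *pattern, const char *string)
{
    int rc;

    assert(pattern != NULL && string != NULL);
    rc = fnmatch(pattern, string, 0);
    if (rc == 0)
        return 1;
    return rc == FNM_NOMATCH ? 0 : -1;
}

#ifdef FNMATCH_DEMO_MAIN  /* assumed macro name; build with -DFNMATCH_DEMO_MAIN */
int main(int argc, char **argv)
{
    const char *loc;

    if (argc != 3) {
        fprintf(stderr, "usage: %s PATTERN STRING\n", argv[0]);
        return 2;
    }
    /* Pick up LANG / LC_* from the environment, e.g. en_US.utf8. */
    setlocale(LC_ALL, "");
    loc = setlocale(LC_COLLATE, NULL);
    /* Print the raw fnmatch() return value, roughly in the format shown above. */
    printf("fnmatch (%s, %s, 0) = %d (%s)\n", argv[1], argv[2],
           fnmatch(argv[1], argv[2], 0), loc ? loc : "unknown");
    return 0;
}
#endif
```

Built with -DFNMATCH_DEMO_MAIN, it can be run as LC_ALL=en_US.utf8 ./fnmatch '[[=a=]]' 'À'. The locale is taken from the environment rather than from an argument, which is a design choice of this sketch and may differ from the original test program.
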
* Re: grep rebuild?
From: Corinna Vinschen @ 2023-03-16 19:35 UTC
To: cygwin-apps
Cc: Brian Inglis

On Mar 16 20:31, Corinna Vinschen via Cygwin-apps wrote:
> Please note that, right now, collating symbols and equivalence classes
> *only* work in the Cygwin main branch in glob(3) and fnmatch(3), but NOT
> YET in regex(3). That's what I'm planning to add in the next couple of
> weeks (or months...)

And, just to be clear, that's what we *never* get with GNULIB's
regex(3). The reason is that collating and equivalence are properties
which can only be handled correctly in that regex implementation, if
the code has access to the locale info files maintained by the Glibc
code.


Corinna

* Re: grep rebuild?
From: Brian Inglis @ 2023-03-17 0:50 UTC
To: cygwin-apps

On 2023-03-16 10:50, Brian Inglis via Cygwin-apps wrote:
> On 2023-03-16 06:08, Corinna Vinschen via Cygwin-apps wrote:
>> Hi Brian,
>>
>> there's a problem with the grep package. It uses the internally
>> provided GNULIB regex library.
>>
>> Unfortunately, that's the default if the system doesn't provide a more
>> recent GLibc. Which we'll never do. The problem is this: Native
>> language support in GNULIB's regex is *only* available, if it's built as
>> part of GLibc.
>>
>> I'd like to ask you to rebuild grep 3.9 with the
>> --without-included-regex option.
>>
>> That will allow grep to use Cygwin's own regex, which already comes with
>> basic native language support, and which I'm working on to better
>> support equivalence class and collation symbol expressions.
>
> Hi Corinna,
>
> We discussed this and I was going to release grep 3.8 test release 3, for
> testing with snapshots or when Cygwin 3.5.0 is released, then grep 3.9 came out,
> and I realized grep is updated every few months, so that went on the back
> burner. I can do a test release for 3.9-2 with that configuration change.
>
> The current release passes all the class tests and works for me and Andrey.
> Are there any other implications of language support affecting grep?

grep 3.9 no longer seems to build on Cygwin with the config option
--without-included-regex. It may require glibc regex, or it may now
autoconfig depending on [g]libc?

/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: dfasearch.o: in function `regex_compile':
/usr/src/debug/grep-3.9-2/src/dfasearch.c:159: undefined reference to `re_set_syntax'
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: /usr/src/debug/grep-3.9-2/src/dfasearch.c:163: undefined reference to `re_compile_pattern'
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: /usr/src/debug/grep-3.9-2/src/dfasearch.c:161: undefined reference to `re_set_syntax'
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: /usr/src/debug/grep-3.9-2/src/dfasearch.c:163: undefined reference to `re_compile_pattern'
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: dfasearch.o: in function `EGexecute':
/usr/src/debug/grep-3.9-2/src/dfasearch.c:555: undefined reference to `re_search'
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: /usr/src/debug/grep-3.9-2/src/dfasearch.c:502: undefined reference to `re_search'
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld: /usr/src/debug/grep-3.9-2/src/dfasearch.c:540: undefined reference to `re_match'
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:1834: grep.exe] Error 1

--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

* Re: grep rebuild?
From: Corinna Vinschen @ 2023-03-17 9:03 UTC
To: cygwin-apps
Cc: Brian Inglis

On Mar 16 18:50, Brian Inglis via Cygwin-apps wrote:
> On 2023-03-16 10:50, Brian Inglis via Cygwin-apps wrote:
> > On 2023-03-16 06:08, Corinna Vinschen via Cygwin-apps wrote:
> > > Hi Brian,
> > >
> > > there's a problem with the grep package. It uses the internally
> > > provided GNULIB regex library.
> > >
> > > Unfortunately, that's the default if the system doesn't provide a more
> > > recent GLibc. Which we'll never do. The problem is this: Native
> > > language support in GNULIB's regex is *only* available, if it's built as
> > > part of GLibc.
> > >
> > > I'd like to ask you to rebuild grep 3.9 with the
> > > --without-included-regex option.
> > >
> > > That will allow grep to use Cygwin's own regex, which already comes with
> > > basic native language support, and which I'm working on to better
> > > support equivalence class and collation symbol expressions.
> >
> > Hi Corinna,
> >
> > We discussed this and I was going to release grep 3.8 test release 3,
> > for testing with snapshots or when Cygwin 3.5.0 is released, then grep
> > 3.9 came out, and I realized grep is updated every few months, so that
> > went on the back burner. I can do a test release for 3.9-2 with that
> > configuration change.
> >
> > The current release passes all the class tests and works for me and Andrey.
> > Are there any other implications of language support affecting grep?
>
> Config option --without-included-regex no longer seems to build with grep
> 3.9 on Cygwin - may require glibc regex - or may now autoconfig depending on
> [g]libc?
>
> /usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld:
> dfasearch.o: in function `regex_compile':
> /usr/src/debug/grep-3.9-2/src/dfasearch.c:159: undefined reference to
> `re_set_syntax'

What a piece of crap! So you either run a GLibc system, or you're
forced to use GNULIB regex because grep uses non-standard functions
in the generic code.

We should switch to FreeBSD grep, it still uses POSIX functions.
What a laugh...


Not amused,
Corinna

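To illustrate the distinction drawn above: the symbols the linker could not find (re_set_syntax, re_compile_pattern, re_search, re_match) belong to the GNU regex interface, whereas POSIX only specifies regcomp, regexec, regerror and regfree. The following is a rough sketch of a matcher that sticks to the POSIX calls. The helper name posix_regex_matches and the REGEX_DEMO_MAIN macro are invented for this example, and this is not how GNU grep or FreeBSD grep is actually structured.

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread):
   1 = match, 0 = no match, -1 = pattern did not compile or regexec() failed. */
int posix_regex_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    /* Flags without REG_EXTENDED select a basic RE, as plain grep uses.
       REG_NOSUB: we only need a yes/no answer, no match offsets. */
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}

#ifdef REGEX_DEMO_MAIN  /* assumed macro name; build with -DREGEX_DEMO_MAIN */
int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s PATTERN TEXT\n", argv[0]);
        return 2;
    }
    /* Locale from the environment decides how [[=a=]] or [[.ch.]] behave,
       to the extent the C library's regex(3) supports them at all. */
    setlocale(LC_ALL, "");
    printf("%d\n", posix_regex_matches(argv[1], argv[2]));
    return 0;
}
#endif
```

Code written this way links against whatever regex(3) the system C library provides, which is the property the --without-included-regex build was hoping for.
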
* Re: grep rebuild?
From: Corinna Vinschen @ 2023-03-17 9:15 UTC
To: cygwin-apps
Cc: Brian Inglis

On Mar 17 10:03, Corinna Vinschen via Cygwin-apps wrote:
> On Mar 16 18:50, Brian Inglis via Cygwin-apps wrote:
> > On 2023-03-16 10:50, Brian Inglis via Cygwin-apps wrote:
> > > On 2023-03-16 06:08, Corinna Vinschen via Cygwin-apps wrote:
> > > > Hi Brian,
> > > >
> > > > there's a problem with the grep package. It uses the internally
> > > > provided GNULIB regex library.
> > > >
> > > > Unfortunately, that's the default if the system doesn't provide a more
> > > > recent GLibc. Which we'll never do. The problem is this: Native
> > > > language support in GNULIB's regex is *only* available, if it's built as
> > > > part of GLibc.
> > > >
> > > > I'd like to ask you to rebuild grep 3.9 with the
> > > > --without-included-regex option.
> > > >
> > > > That will allow grep to use Cygwin's own regex, which already comes with
> > > > basic native language support, and which I'm working on to better
> > > > support equivalence class and collation symbol expressions.
> > >
> > > Hi Corinna,
> > >
> > > We discussed this and I was going to release grep 3.8 test release 3,
> > > for testing with snapshots or when Cygwin 3.5.0 is released, then grep
> > > 3.9 came out, and I realized grep is updated every few months, so that
> > > went on the back burner. I can do a test release for 3.9-2 with that
> > > configuration change.
> > >
> > > The current release passes all the class tests and works for me and Andrey.
> > > Are there any other implications of language support affecting grep?
> >
> > Config option --without-included-regex no longer seems to build with grep
> > 3.9 on Cygwin - may require glibc regex - or may now autoconfig depending on
> > [g]libc?
> >
> > /usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld:
> > dfasearch.o: in function `regex_compile':
> > /usr/src/debug/grep-3.9-2/src/dfasearch.c:159: undefined reference to
> > `re_set_syntax'
>
> What a piece of crap! So you either run a GLibc system, or you're
> forced to use GNULIB regex because grep uses non-standard functions
> in the generic code.
>
> We should switch to FreeBSD grep, it still uses POSIX functions.
> What a laugh...

And just for kicks, FreeBSD grep is mostly option-compatible with GNU
grep. The missing options are:

  -P, --perl-regexp
  --no-ignore-case              (but --ignore-case exists)
  -y                            (obsolete anyway)
  -T, --initial-tab
  -Z                            (but --null exists)
  --group-separator
  --no-group-separator
  --exclude-from
  -I                            (but --binary-files exists)
  -R, --dereference-recursive

I wonder if I should create a freebsd-grep package, installing its
grep as 'bsdgrep' or something....


Corinna
