public inbox for gcc-bugs@sourceware.org
* [Bug other/35855]  New: build locale not properly handled with awk scripts
@ 2008-04-07  6:17 vapier at gentoo dot org
  2009-03-10 17:24 ` [Bug bootstrap/35855] " urmet dot saar at gmail dot com
From: vapier at gentoo dot org @ 2008-04-07  6:17 UTC
  To: gcc-bugs

The gcc build system has some awk scripts that use locale-dependent character ranges:
$ grep a-z gcc/*.awk
gcc/optc-gen.awk:       gsub( "[^A-Za-z0-9_]", "X", macros[i] )
gcc/optc-gen.awk:       gsub ("[^A-Za-z0-9]", "_", enum)
gcc/opt-functions.awk:  gsub ("[^A-Za-z0-9]", "_", name)
gcc/opth-gen.awk:       gsub( "[^A-Za-z0-9_]", "X", macros[i] )
gcc/opth-gen.awk:       gsub ("[^A-Za-z0-9]", "_", enum)

In some locales, A-Z does not match the expected alphabet (the range as
defined by the "C" locale).  While this has always been a problem, it went
unnoticed because the incorrect munging was at least consistent.  With gcc-4.3,
the incorrect munging results in a conflict of symbols and triggers a build
failure.
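
To make the failure mode concrete, here is a minimal sketch of how such a
conflict can arise.  It does not depend on any particular locale being
installed: the helper names and the option names are hypothetical, and the
deliberately narrowed range a-s is an assumption that merely stands in for a
locale whose collation leaves some letters outside a-z.

```shell
# Hypothetical helper imitating the macro munging in optc-gen.awk.
# The range a-s is deliberately too narrow: it stands in for a locale
# whose collation leaves some letters outside a-z. LC_ALL=C keeps the
# demonstration itself deterministic.
munge_narrow() {
    printf '%s\n' "$1" | LC_ALL=C awk '{ gsub("[^A-Za-s0-9_]", "X"); print }'
}

# Same munging with the full ASCII range, as the C locale interprets it.
munge_full() {
    printf '%s\n' "$1" | LC_ALL=C awk '{ gsub("[^A-Za-z0-9_]", "X"); print }'
}

munge_full   opt_t    # prints opt_t
munge_full   opt_u    # prints opt_u
munge_narrow opt_t    # prints opX_X
munge_narrow opt_u    # prints opX_X, the same symbol as the line above
```

Two distinct names survive the full range untouched but collapse into one
symbol once letters fall outside the range, which is the kind of conflict
described above.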

I can't seem to find any precedent as to the correct fix.  I would personally
just change every range to the [:alnum:] character class (written as
[^[:alnum:]] inside the bracket expression), but the only use of such
character classes that I can find via a quick grep is in the lex files.
Another solution would be to execute the awk stuff via configure, as it sets
up a clean environment.  Or you can prepend "env LC_ALL=C" to the AWK variable
set up via configure, but this ignores the problems that configure already
solves automatically.
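
As a rough sketch of the last option, assuming a POSIX shell and an awk on
PATH: the AWK variable name comes from the report, while the way it is set
here and the sample option name are only illustrative, not what configure
actually emits.

```shell
# AWK mirrors the variable named in the report; the value is the suggested
# prefix, which pins the locale for the awk process only.
AWK="env LC_ALL=C awk"

# Hypothetical input; the gsub pattern is the one used in opt-functions.awk.
# $AWK is left unquoted on purpose so it splits into command and arguments.
printf '%s\n' "fno-strict-aliasing" | $AWK '{ gsub("[^A-Za-z0-9]", "_"); print }'
# prints fno_strict_aliasing
```

Because the override is scoped to the awk invocation, the rest of the build
keeps whatever locale the user has set.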


-- 
           Summary: build locale not properly handled with awk scripts
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: vapier at gentoo dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35855



* [Bug bootstrap/35855] build locale not properly handled with awk scripts
  2008-04-07  6:17 [Bug other/35855] New: build locale not properly handled with awk scripts vapier at gentoo dot org
@ 2009-03-10 17:24 ` urmet dot saar at gmail dot com
From: urmet dot saar at gmail dot com @ 2009-03-10 17:24 UTC
  To: gcc-bugs



------- Comment #1 from urmet dot saar at gmail dot com  2009-03-10 17:24 -------
I can confirm this; it has been annoying me for some time.
When I changed every "A-Za-z0-9" to [:alnum:], the symbol conflicts went away,
and diff confirmed that the generated files were identical to the ones
generated earlier with LC_ALL="C".
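
A small-scale version of that check could look like the sketch below.  The
option names are made up and stand in for the real input, and it assumes an
awk that supports POSIX character classes (some older awk implementations do
not).

```shell
# Hypothetical option names standing in for the real .opt input.
names='fno-strict-aliasing
Wall
mtune=generic
O2'

# Range-based munging, pinned to the C locale.
by_range=$(printf '%s\n' "$names" | LC_ALL=C awk '{ gsub("[^A-Za-z0-9]", "_"); print }')

# Class-based munging in whatever locale the shell is running in.
# [:alnum:] is only valid inside a bracket expression, hence [^[:alnum:]].
by_class=$(printf '%s\n' "$names" | awk '{ gsub("[^[:alnum:]]", "_"); print }')

if [ "$by_range" = "$by_class" ]; then
    echo "identical"
else
    echo "outputs differ"
fi
```

For ASCII-only names like these, the two results should match regardless of
the current locale, since the class does not depend on collation order.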


-- 

urmet dot saar at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |urmet dot saar at gmail dot
                   |                            |com



