From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 4214 invoked by alias); 17 Jun 2009 15:47:18 -0000
Received: (qmail 4175 invoked by uid 48); 17 Jun 2009 15:47:05 -0000
Date: Wed, 17 Jun 2009 15:47:00 -0000
From: "jbastian at redhat dot com"
To: glibc-bugs-regex@sources.redhat.com
Message-ID: <20090617154705.10290.jbastian@redhat.com>
Reply-To: sourceware-bugzilla@sourceware.org
Subject: [Bug regex/10290] New: using REG_ICASE can break ranges
X-Bugzilla-Reason: CC
Mailing-List: contact glibc-bugs-regex-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: glibc-bugs-regex-owner@sourceware.org
X-SW-Source: 2009-06/txt/msg00000.txt.bz2

Using a regular expression range like [C-a] works fine if compiled with
regcomp() with just the REG_EXTENDED flag, but if the REG_ICASE flag is
added too, regcomp() returns the error "Invalid range end".

Testing other ranges with REG_ICASE reveals:

  [A-Z^-z]     is invalid: Invalid range end (11)
  [A-Z^_`a-z]  is ok
  [C-a]        is invalid: Invalid range end (11)
  [C-f]        is ok
  [_-a]        is invalid: Invalid range end (11)
  [<-a]        is ok
  [z-{]        is ok

It appears that regcomp() capitalizes the range endpoints when the
REG_ICASE flag is used, so [C-a] becomes [C-A], and since A comes before C,
the range is invalid. Likewise, in locales that match ASCII, ^ comes
before z but after Z, so [A-Z^-z] becomes invalid, and _ comes after A but
before a, so [_-a] becomes invalid.

If this is not considered a bug, then at the very least the regex(3) man
page should note the side effects of using REG_ICASE.
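For anyone who wants to reproduce this, here is a minimal sketch of the kind of test that produces the results above. It is not the exact program used for the report; the helper names compile_status() and report_range() are made up for illustration, and only regcomp(), regerror() and regfree() from <regex.h> are assumed.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: compile `pattern` with `cflags` and return the
   regcomp() status (0 on success). If `msg` is non-NULL it receives the
   regerror() text for a failure, or "ok" on success. */
int compile_status(const char *pattern, int cflags, char *msg, size_t msg_len)
{
    regex_t re;
    int rc = regcomp(&re, pattern, cflags);

    if (msg != NULL && msg_len > 0) {
        if (rc != 0)
            regerror(rc, &re, msg, msg_len);
        else
            snprintf(msg, msg_len, "ok");
    }
    if (rc == 0)
        regfree(&re);   /* only free a successfully compiled pattern */
    return rc;
}

/* Hypothetical helper: print the status of one bracket expression,
   first with REG_EXTENDED alone and then with REG_ICASE added. */
void report_range(const char *pattern)
{
    char plain[128], icase[128];
    int rc_plain = compile_status(pattern, REG_EXTENDED,
                                  plain, sizeof plain);
    int rc_icase = compile_status(pattern, REG_EXTENDED | REG_ICASE,
                                  icase, sizeof icase);

    printf("%-12s REG_EXTENDED: %s (%d)   +REG_ICASE: %s (%d)\n",
           pattern, plain, rc_plain, icase, rc_icase);
}
```

Calling report_range("[C-a]"), report_range("[_-a]") and so on from main() prints one line per pattern. With glibc 2.9 the REG_ICASE column showed "Invalid range end" for the ranges listed as invalid above; other libc versions or locales may behave differently.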
-- 
           Summary: using REG_ICASE can break ranges
           Product: glibc
           Version: 2.9
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
        AssignedTo: drepper at redhat dot com
        ReportedBy: jbastian at redhat dot com
                CC: glibc-bugs-regex at sources dot redhat dot com,
                    glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=10290

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.