From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 25855 invoked by alias); 27 Jun 2012 08:51:33 -0000
Received: (qmail 25841 invoked by uid 22791); 27 Jun 2012 08:51:32 -0000
X-SWARE-Spam-Status: No, hits=-3.7 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00,KHOP_THREADED
X-Spam-Check-By: sourceware.org
Received: from localhost (HELO sourceware.org) (127.0.0.1) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 27 Jun 2012 08:51:20 +0000
From: "valery_reznic at yahoo dot com"
To: glibc-bugs-regex@sources.redhat.com
Subject: [Bug regex/14301] Regular expression wrong match with a number of groups
Date: Wed, 27 Jun 2012 08:51:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: regex
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: valery_reznic at yahoo dot com
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact glibc-bugs-regex-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: glibc-bugs-regex-owner@sourceware.org
X-SW-Source: 2012-06/txt/msg00004.txt.bz2

http://sourceware.org/bugzilla/show_bug.cgi?id=14301

--- Comment #4 from Valery 2012-06-27 08:50:54 UTC ---
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)
> > > (4[0-9]{12}) matches 4123456789012.
> >
> > So what?
> > Regular expression required leading and trailing space to be present for match.
>
> I don't think so.  "|" does not bind this way.  a[b]|c|d[e] is equivalent to
> (ab)|c|(de), not (a)(b|c|d)(e).
>
> > Also please note, that in case 3 and 4 regular expressions are essentially the
> > same - only (4[0-9]{15}) and (4[0-9]{12}) that connected with OR swapped.
>
> This also explains the difference when swapping the parenthesized constructs.

Got it!
The leading space was part of only the first ( ) group, and the trailing space
was part of only the last ( ) group.
For some reason I thought that '|' has higher precedence than concatenation.

One more pair of () fixed the problem:
'[ ]((4[0-9]{15})|(4[0-9]{12})|(AAA))[ ]'

Thank you very much for the explanation.
I can't believe how stupid I was.
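
The precedence point above can be checked with a short sketch. It uses Python's re module rather than the glibc regcomp()/regexec() functions this bug is about, on the assumption that alternation binds more loosely than concatenation in both. The UNGROUPED pattern is not quoted anywhere in this comment; it is a reconstruction obtained by dropping the outer () from the fixed pattern. The helper name matched_text and the sample strings are made up for illustration.

```python
import re

# Assumed reconstruction of the original pattern: the fixed pattern from
# this comment with the outer pair of () removed.
UNGROUPED = r'[ ](4[0-9]{15})|(4[0-9]{12})|(AAA)[ ]'

# Fixed pattern, as quoted in the comment.
GROUPED = r'[ ]((4[0-9]{15})|(4[0-9]{12})|(AAA))[ ]'


def matched_text(pattern, text):
    """Return the text of the first match of pattern in text, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# A 13-digit number with no spaces around it.
no_spaces = "x4123456789012x"
# The same number with a space on each side.
with_spaces = "x 4123456789012 x"

# Without the outer group, the middle alternative stands alone, so the
# number matches even though no space is present.
print(repr(matched_text(UNGROUPED, no_spaces)))

# With the outer group, both spaces are required for every alternative.
print(repr(matched_text(GROUPED, no_spaces)))
print(repr(matched_text(GROUPED, with_spaces)))
```

With the outer group in place, the match includes both surrounding spaces, so the number itself has to be read from group 1 rather than from the whole match.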