From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 17119 invoked by alias); 26 Nov 2012 08:26:28 -0000
Received: (qmail 16965 invoked by uid 48); 26 Nov 2012 08:26:10 -0000
From: "allachan at au1 dot ibm.com"
To: glibc-bugs@sources.redhat.com
Subject: [Bug stdio/12701] scanf accepts non-matching input
Date: Mon, 26 Nov 2012 08:26:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: stdio
X-Bugzilla-Keywords:
X-Bugzilla-Severity: critical
X-Bugzilla-Who: allachan at au1 dot ibm.com
X-Bugzilla-Status: REOPENED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: CC
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: glibc-bugs-owner@sourceware.org
X-SW-Source: 2012-11/txt/msg00219.txt.bz2

http://sourceware.org/bugzilla/show_bug.cgi?id=12701

paxdiablo changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |allachan at au1 dot ibm.com

--- Comment #15 from paxdiablo 2012-11-26 08:26:04 UTC ---
I think this bug report is correct, at least in relation to the '%x/0xz'
sample.

There's a big difference between an input item, which *may* be only a prefix
of a properly scanned directive, and the *properly scanned directive* itself.

Pushback controls how far you can back up the "input stream pointer" and is
the reason why scanf is usually not used by professionals, who prefer an
fgets/sscanf combo so they can back up to the start of the line themselves.
However, the pushback is only relevant here in that context.
When '0x' fails to match '%x', scanf cannot push back all the way to the '0'
because of this limitation.

The function call sscanf ("a0xz", "%c%x%c") should return 1, not 3.

The controlling part of the standard is the bit dealing with the 'x' directive
itself:

=====
Matches an optionally signed hexadecimal integer, whose format is the same as
expected for the subject sequence of the strtoul function with the value 16 for
the base argument.
=====

The strtoul stuff states:

=====
If the value of base is zero, the expected form of the subject sequence is that
of an integer constant as described in 6.4.4.1, optionally preceded by a plus
or minus sign, but not including an integer suffix. If the value of base is
between 2 and 36 (inclusive), the expected form of the subject sequence is a
sequence of letters and digits representing an integer with the radix specified
by base, optionally preceded by a plus or minus sign, but not including an
integer suffix. The letters from a (or A) through z (or Z) are ascribed the
values 10 through 35; only letters and digits whose ascribed values are less
than that of base are permitted. If the value of base is 16, the characters 0x
or 0X may optionally precede the sequence of letters and digits, following the
sign if present.
=====

The controlling part there would be "a sequence of letters and digits
representing an integer" - you may argue that such a sequence may consist of
zero characters, but I don't think anyone in their right mind would suggest
that an empty sequence represents an integer.

In any case, the '0x' string fails on strtoul:

    char *x;
    int rc = 42;
    rc = strtoul ("0x", &x, 16);
    printf ("%d [%s]\n", rc, x);

produces:

    0 [0x]

So even though rc is set to 0, the fact that the pointer points to the first
bad character means that the '0x' itself is not a valid hex number. Putting in
'0x5' as the string gives you:

    5 []

so that the first bad character is the end of the string (i.e., there WERE no
bad characters).