public inbox for glibc-bugs@sourceware.org
From: "emogenet at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug libc/15854] strtod should avoid calling strlen
Date: Tue, 20 Aug 2013 07:29:00 -0000	[thread overview]
Message-ID: <bug-15854-131-NO6oT57Biv@http.sourceware.org/bugzilla/> (raw)
In-Reply-To: <bug-15854-131@http.sourceware.org/bugzilla/>

http://sourceware.org/bugzilla/show_bug.cgi?id=15854

--- Comment #2 from emogenet at gmail dot com ---
You are correct, the call to strlen in strtod is not a problem. I incorrectly
assumed it was calling strlen on the whole buffer because sscanf does exhibit
the problem I describe, but as it turns out, the problem is inherent to
sscanf, and strtod works fine.

As a matter of fact, I just tested glibc's strtod on a very large mmap'd
ASCII buffer, and it works fine: no quadratic behavior.
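
For reference, a minimal sketch of the kind of loop such a test can use. This
is not the actual test program: sum_floats is a hypothetical helper name, and
the buffer is assumed to be NUL-terminated and to hold whitespace-separated
ASCII floats. Each call resumes at the end pointer returned by the previous
one, so the buffer is walked once rather than rescanned from the start.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: add up every number found in a NUL-terminated
   buffer of whitespace-separated ASCII floats.
   Stores the total in *sum and returns how many numbers were parsed. */
size_t sum_floats(const char *buf, double *sum)
{
    size_t count = 0;
    const char *p = buf;

    *sum = 0.0;
    for (;;) {
        char *end;
        double v = strtod(p, &end);  /* strtod skips leading whitespace */

        if (end == p)                /* no conversion: end of data or junk */
            break;
        *sum += v;
        count++;
        p = end;                     /* resume where strtod stopped */
    }
    return count;
}
```

Calling sum_floats(buf, &total) on the mapped region and timing it for
growing input sizes should show roughly linear scaling if strtod only reads
the characters it consumes.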

Apologies for not testing this better before reporting the bug. Please feel
free to close.

   - Emmanuel



On Tue, Aug 20, 2013 at 9:11 AM, neleai at seznam dot cz <
sourceware-bugzilla@sourceware.org> wrote:

> http://sourceware.org/bugzilla/show_bug.cgi?id=15854
>
> --- Comment #1 from Ondrej Bilka <neleai at seznam dot cz> ---
> On Tue, Aug 20, 2013 at 02:12:32AM +0000, emogenet at gmail dot com wrote:
> > http://sourceware.org/bugzilla/show_bug.cgi?id=15854
> >
> >             Bug ID: 15854
> >            Summary: strtod should avoid calling strlen
> >            Product: glibc
> >            Version: 2.18
> >             Status: NEW
> >           Severity: enhancement
> >           Priority: P2
> >          Component: libc
> >           Assignee: unassigned at sourceware dot org
> >           Reporter: emogenet at gmail dot com
> >                 CC: drepper.fsp at gmail dot com
> >
> > Problem: glibc's strtod seems to systematically call strlen on its input.
> >
> > To the layman that I am, there doesn't seem to be any legitimate reason
> > why it should: it seems that strtod should simply consume its input one
> > char at a time until it reaches a char that marks the end of a valid FP
> > number ASCII rep., and should therefore work on a non-zero-terminated
> > buffer, as long as said buffer ends with a char that terminates the
> > parsing.
> >
> This is not that big a problem; strtod only uses strlen in the following
> context:
>
>   decimal = _NL_CURRENT (LC_NUMERIC, DECIMAL_POINT); // which is "."
>   decimal_len = strlen (decimal); // which is 1
>
>
> > This internal call to strlen makes it essentially impossible to call
> > strtod on a non-zero-terminated buffer, and there seems to be no other
> > way to access the non-trivial code that converts an ASCII buffer to an
> > FP number.
> >
> > This makes it particularly painful to call strtod on a very large
> > mmap'd buffer of ASCII floats: strlen will plow through the entire file
> > for every call to strtod, making things highly inefficient (it is also
> > not guaranteed not to crash).
> >
> Do you have a test case to demonstrate the quadratic behavior? It is
> possible that the end is determined by other inefficient means.
>
> > To work around this shortcoming, one ends up having to figure out the
> > end of the FP ASCII string, "by hand", copy the result to a zero
> > terminated buffer, and then call strtod on that.
> >
> > This is both inefficient and clunky.
> >
> > See this article for a good description of the issue:
> >
> > http://www.ryanjuckett.com/programming/c-cplusplus/25-optimizing-atof-and-strtod
> >
> > Here's another instance of the problem:
> >
> > http://stackoverflow.com/questions/2033845/any-one-know-how-to-convert-a-huge-char-array-to-float-very-huge-array-perform
> >
> Not relevant for us, as these are Windows problems.
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.
> You reported the bug.
>

-- 
You are receiving this mail because:
You are on the CC list for the bug.

