From: "Ondřej Bílka" <neleai@seznam.cz>
To: emogenet at gmail dot com <sourceware-bugzilla@sourceware.org>
Cc: glibc-bugs@sourceware.org
Subject: Re: [Bug libc/15854] New: strtod should avoid calling strlen
Date: Tue, 20 Aug 2013 07:11:00 -0000
Message-ID: <20130820071124.GA5814@domone.kolej.mff.cuni.cz>
In-Reply-To: <bug-15854-131@http.sourceware.org/bugzilla/>

On Tue, Aug 20, 2013 at 02:12:32AM +0000, emogenet at gmail dot com wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=15854
>
>     Bug ID: 15854
>    Summary: strtod should avoid calling strlen
>    Product: glibc
>    Version: 2.18
>     Status: NEW
>   Severity: enhancement
>   Priority: P2
>  Component: libc
>   Assignee: unassigned at sourceware dot org
>   Reporter: emogenet at gmail dot com
>         CC: drepper.fsp at gmail dot com
>
> Problem : glibc's strtod seem to systematically call strlen on its input.
>
> To the layman that I am, there doesn't seem to be any legitimate reason why it
> should: it seems that strtod should simply consume its input one char at a time
> until it reaches a char that marks the end of a valid FP number ASCII rep. and
> should therefore work on a non-zero terminated buffer, as long said buffer ends
> with a char that terminates the parsing.
>
This is not that big a problem; strtod only uses strlen in the following
context:

  decimal = _NL_CURRENT (LC_NUMERIC, DECIMAL_POINT); // which is "."
  decimal_len = strlen (decimal); // which is 1

> This internal call to strlen makes it essentially impossible to call strtod
> on a no zero terminated buffer, and there seems to be no other way to otherwise
> access the non-trivial code that converts an ASCII buffer to a FP number.
>
> This makes it in particular painful to call strtod on a very large mmap'd
> buffer of ASCII floats : strlen will plow through the entire file for every
> call to strtod, making things highly inefficient (it is also not guaranteed
> not to crash).
>
Do you have a test case that demonstrates the quadratic behavior?
It is possible that the end of the input is determined by some other
inefficient means.

> To work around this shortcoming, one ends up having to figure out the end of
> the FP ASCII string, "by hand", copy the result to a zero terminated buffer,
> and then call strtod on that.
>
> This is both inefficient and clunky.
>
> See this article for a good description of the issue:
>
> http://www.ryanjuckett.com/programming/c-cplusplus/25-optimizing-atof-and-strtod
>
> Here's another instance of the problem:
>
> http://stackoverflow.com/questions/2033845/any-one-know-how-to-convert-a-huge-char-array-to-float-very-huge-array-perform
>
Not relevant for us, as these are Windows problems.
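
For illustration only, here is a minimal sketch of the two points discussed
above. It is not code from glibc or from this thread. It assumes the public
localeconv() interface as a stand-in for the internal _NL_CURRENT lookup, and
the helper names decimal_point_len and parse_doubles are hypothetical. The
second helper walks a NUL-terminated buffer with strtod and its end pointer,
which is the loop one would time on a large input to check for quadratic
behavior.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Length of the current locale's decimal-point string. localeconv() is
   the public interface; using it here in place of glibc's internal
   _NL_CURRENT macro is an assumption made for this sketch. */
size_t decimal_point_len(void)
{
    return strlen(localeconv()->decimal_point);
}

/* Hypothetical helper: parse up to max_out doubles from a NUL-terminated
   buffer, advancing with the end pointer that strtod reports. Returns the
   number of values stored in out. */
size_t parse_doubles(const char *buf, double *out, size_t max_out)
{
    assert(buf != NULL && out != NULL);

    size_t n = 0;
    const char *p = buf;

    while (n < max_out) {
        char *end;
        double v = strtod(p, &end);  /* skips leading whitespace itself */
        if (end == p)                /* no conversion: end of input or junk */
            break;
        out[n++] = v;
        p = end;                     /* continue right after the parsed number */
    }
    return n;
}
```

Calling parse_doubles on a buffer such as "1.5 2.25\n-3e2" fills the output
array with three values. Timing the same loop on inputs of increasing size
would show whether the cost per call grows with the remaining length of the
buffer.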