* [Bug libc/15854] strtod should avoid calling strlen
2013-08-20 2:12 [Bug libc/15854] New: strtod should avoid calling strlen emogenet at gmail dot com
@ 2013-08-20 2:13 ` emogenet at gmail dot com
2013-08-20 7:11 ` neleai at seznam dot cz
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: emogenet at gmail dot com @ 2013-08-20 2:13 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=15854
emogenet at gmail dot com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |emogenet at gmail dot com
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug libc/15854] strtod should avoid calling strlen
From: neleai at seznam dot cz @ 2013-08-20 7:11 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=15854
--- Comment #1 from Ondrej Bilka <neleai at seznam dot cz> ---
On Tue, Aug 20, 2013 at 02:12:32AM +0000, emogenet at gmail dot com wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=15854
>
> Bug ID: 15854
> Summary: strtod should avoid calling strlen
> Product: glibc
> Version: 2.18
> Status: NEW
> Severity: enhancement
> Priority: P2
> Component: libc
> Assignee: unassigned at sourceware dot org
> Reporter: emogenet at gmail dot com
> CC: drepper.fsp at gmail dot com
>
> Problem: glibc's strtod seems to systematically call strlen on its input.
>
> To the layman that I am, there doesn't seem to be any legitimate reason why it
> should: it seems that strtod should simply consume its input one char at a time
> until it reaches a char that marks the end of a valid FP number ASCII rep. and
> should therefore work on a non-zero-terminated buffer, as long as said buffer ends
> with a char that terminates the parsing.
>
This is not that big a problem; strtod only uses strlen in the following context:
    decimal = _NL_CURRENT (LC_NUMERIC, DECIMAL_POINT); // which is "."
    decimal_len = strlen (decimal); // which is 1
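For illustration, the same quantity can be inspected from user code through the standard localeconv() interface. This is only a sketch: decimal_point_len is a hypothetical helper name, and glibc itself reads the value through its internal _NL_CURRENT macro, not through localeconv().

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: length of the current locale's decimal separator.
   Per the comment above, this short string is the only thing strtod
   passes to strlen. */
size_t decimal_point_len(void)
{
    const struct lconv *lc = localeconv();  /* standard C, <locale.h> */
    return strlen(lc->decimal_point);
}
```

In the default "C" locale the separator is ".", so the helper returns 1 regardless of how long the string being parsed is.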
> This internal call to strlen makes it essentially impossible to call strtod
> on a non-zero-terminated buffer, and there seems to be no other way to
> access the non-trivial code that converts an ASCII buffer to a FP number.
>
> This makes it particularly painful to call strtod on a very large mmap'd
> buffer of ASCII floats: strlen will plow through the entire file for every
> call to strtod, making things highly inefficient (it is also not guaranteed
> not to crash).
>
Do you have a test case that demonstrates the quadratic behavior? It is possible
that the end is determined by other inefficient means.
> To work around this shortcoming, one ends up having to figure out the end of
> the FP ASCII string, "by hand", copy the result to a zero terminated buffer,
> and then call strtod on that.
>
> This is both inefficient and clunky.
>
> See this article for a good description of the issue:
>
> http://www.ryanjuckett.com/programming/c-cplusplus/25-optimizing-atof-and-strtod
>
> Here's another instance of the problem:
>
> http://stackoverflow.com/questions/2033845/any-one-know-how-to-convert-a-huge-char-array-to-float-very-huge-array-perform
>
Not relevant for us, as these are Windows problems.
* [Bug libc/15854] strtod should avoid calling strlen
From: emogenet at gmail dot com @ 2013-08-20 7:29 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=15854
--- Comment #2 from emogenet at gmail dot com ---
You are correct, the call to strlen in strtod is not a problem. I incorrectly assumed
it was calling strlen on the whole buffer because sscanf does exhibit the problem I
describe, but as it turns out, the problem is inherent to sscanf, and strtod works fine.
As a matter of fact, I just tested glibc's strtod on a very large ASCII mmap'd buffer,
and it works fine, no quadratic behavior.
Apologies for not testing this better before reporting the bug. Please feel free to close.
- Emmanuel
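A test of the kind described above could be built around a loop like the following. It is a minimal sketch, not the code that was actually run: parse_doubles is a hypothetical name, and the buffer is assumed to be NUL-terminated (or at least to end in a character that stops the parse). Each call to strtod resumes where the previous one stopped, via the end pointer it reports.

```c
#include <stdlib.h>

/* Hypothetical helper: parse up to max doubles from buf into out.
   Assumes buf is NUL-terminated. Returns the number of values stored. */
size_t parse_doubles(const char *buf, double *out, size_t max)
{
    size_t n = 0;
    const char *p = buf;

    while (n < max) {
        char *end;
        double v = strtod(p, &end);  /* skips leading whitespace itself */
        if (end == p)
            break;                   /* no conversion: stop at this char */
        out[n++] = v;
        p = end;                     /* continue right after the number */
    }
    return n;
}
```

Timing this loop over buffers of increasing size is one way to check for quadratic behavior: if strtod only looks at the characters it consumes, the total time should grow roughly linearly.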
* [Bug libc/15854] strtod should avoid calling strlen
From: allan at archlinux dot org @ 2013-08-20 7:32 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=15854
Allan McRae <allan at archlinux dot org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |allan at archlinux dot org
         Resolution|---                         |INVALID
--- Comment #3 from Allan McRae <allan at archlinux dot org> ---
Closing.
* [Bug libc/15854] strtod should avoid calling strlen
From: neleai at seznam dot cz @ 2013-08-24 7:42 UTC (permalink / raw)
To: glibc-bugs
http://sourceware.org/bugzilla/show_bug.cgi?id=15854
--- Comment #4 from Ondrej Bilka <neleai at seznam dot cz> ---
On Tue, Aug 20, 2013 at 07:29:21AM +0000, emogenet at gmail dot com wrote:
> http://sourceware.org/bugzilla/show_bug.cgi?id=15854
>
> --- Comment #2 from emogenet at gmail dot com ---
> You are correct, the call to strlen in strtod is not a problem. I incorrectly assumed
> it was calling strlen on the whole buffer because sscanf does exhibit the problem I
> describe, but as it turns out, the problem is inherent to sscanf, and strtod works fine.
>
Could you also provide an sscanf test case as a separate bug report?
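A reproducer along those lines might start from a loop like this one. It is a sketch only: scan_doubles is a hypothetical name, the timing harness is left out, and the quadratic cost is the claim to be tested here, not a measured result. The loop uses the standard %n conversion to learn how many characters each sscanf call consumed, then calls sscanf again on the remainder of the buffer.

```c
#include <stdio.h>

/* Hypothetical helper: parse up to max doubles from buf with sscanf.
   Assumes buf is NUL-terminated. Returns the number of values stored. */
size_t scan_doubles(const char *buf, double *out, size_t max)
{
    size_t n = 0;
    const char *p = buf;

    while (n < max) {
        int used = 0;
        /* %n stores the count of characters consumed so far; it does not
           add to the return value, so success is exactly 1. */
        if (sscanf(p, "%lf%n", &out[n], &used) != 1)
            break;
        n++;
        p += used;
    }
    return n;
}
```

If every sscanf call first measures the whole remaining string, as reported in comment #2, running this over buffers of doubling size should show the total time roughly quadrupling.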
* [Bug libc/15854] strtod should avoid calling strlen
From: fweimer at redhat dot com @ 2014-06-13 13:08 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=15854
Florian Weimer <fweimer at redhat dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-