[RFC][BZ #16009] Memory handling in strxfrm

public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed

* [RFC][BZ #16009] Memory handling in strxfrm_l()
@ 2014-11-20 11:10 Leonhard Holz
  2014-11-21  4:03 ` Rich Felker
  2014-11-23 16:09 ` Ondřej Bílka
  0 siblings, 2 replies; 10+ messages in thread
From: Leonhard Holz @ 2014-11-20 11:10 UTC (permalink / raw)
  To: libc-alpha

Hello everybody,

by reading through the strxfrm_l function to solve a possible buffer 
overflow bug (https://sourceware.org/bugzilla/show_bug.cgi?id=16009) I 
found some other improvable things and would like to have some feedback 
how to proceed.

1. Line 133: Empty string handling. This could be placed at the very 
start of the function. Instead of determining the strlen() of the input 
one could check if the first byte is \0 -> faster special path.

2. Line 151: Size of needed buffer for weights and rule cache. This is 
calculated as (srclen + 1) * (sizeof (int32_t) + 1). Since srclen and 
the malloc parameter are both size_t this will cause an integer overflow 
for large enough strings and result in a too-small buffer which is 
written beyond its boundary (this is actually bug 16009).

3. Lines 153, 169, 170: Repetition of buffer size calculation. I would 
like to introduce a variable to calculate it just ones.

4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l 
tries to allocate the memory on the stack. This looks a bit weird to me 
since if the needed memory is too large for the heap then it's probably 
too large for the stack. Also alloca has a bad fail behaviour, best case 
it terminates the program with a stack overflow error, otherwise the 
"program behavior is undefined" (man page). So I think one should avoid 
allocation attempts larger than __libc_use_alloca recommends in any case.

5. Handling of failed memory allocation. Since the last ressort in the 
current implementation is a stack overflow there is currently no 
"function could not compute" path. I'd like to add that in case the 
needed buffer is too large at all (regarding size_t) or too large for 
alloca + malloc fails. For the return value the man page states: "The 
strxfrm() function returns the number of bytes required to store the 
transformed string in dest excluding the terminating null byte ('\0'). 
If the value returned is n or more, the contents of dest are 
indeterminate." - Does that imply that returning n is a good idea to 
communicate an error?

6. Use the given parameter n instead of srclen = strlen(src) for 
computing the buffer size. Since only n bytes are written into dst the 
number of weights and rules to cache should be limited by this parameter 
and calling strlen could be avoided. Does this sound reasonable?

Best,
Leonhard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-20 11:10 [RFC][BZ #16009] Memory handling in strxfrm_l() Leonhard Holz
@ 2014-11-21  4:03 ` Rich Felker
  2014-11-21 10:23   ` Leonhard Holz
  2014-11-23 16:09 ` Ondřej Bílka
  1 sibling, 1 reply; 10+ messages in thread
From: Rich Felker @ 2014-11-21  4:03 UTC (permalink / raw)
  To: Leonhard Holz; +Cc: libc-alpha

On Thu, Nov 20, 2014 at 12:10:29PM +0100, Leonhard Holz wrote:
> 4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l
> tries to allocate the memory on the stack. This looks a bit weird to
> me since if the needed memory is too large for the heap then it's
> probably too large for the stack. Also alloca has a bad fail
> behaviour, best case it terminates the program with a stack overflow
> error, otherwise the "program behavior is undefined" (man page). So
> I think one should avoid allocation attempts larger than
> __libc_use_alloca recommends in any case.

Indeed, this is a serious bug.

> 5. Handling of failed memory allocation. Since the last ressort in
> the current implementation is a stack overflow there is currently no
> "function could not compute" path. I'd like to add that in case the
> needed buffer is too large at all (regarding size_t) or too large
> for alloca + malloc fails. For the return value the man page states:
> "The strxfrm() function returns the number of bytes required to
> store the transformed string in dest excluding the terminating null
> byte ('\0'). If the value returned is n or more, the contents of
> dest are indeterminate." - Does that imply that returning n is a
> good idea to communicate an error?

No, the function is not permitted to return an error; it's required by
ISO C to produce a result. Falsely reporting that it needs more space
for the result, and thereby causing the caller to keep allocating
larger and larger buffers until it runs out of memory itself, is not
valid; in particular, it could report different needed lengths for the
same string at different calls in the ame program with the same
locale.

If strcoll_l is using an algorithm that requires allocation, this
needs to be fixed -- there's no fundamental reason it needs to
allocate.

> 6. Use the given parameter n instead of srclen = strlen(src) for
> computing the buffer size. Since only n bytes are written into dst
> the number of weights and rules to cache should be limited by this
> parameter and calling strlen could be avoided. Does this sound
> reasonable?

I'm not sure about this; I think you should check carefully if it's
actually valid.

Rich

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-21  4:03 ` Rich Felker
@ 2014-11-21 10:23   ` Leonhard Holz
  2014-11-21 17:22     ` Joseph Myers
  2014-11-22 23:39     ` Ondřej Bílka
  0 siblings, 2 replies; 10+ messages in thread
From: Leonhard Holz @ 2014-11-21 10:23 UTC (permalink / raw)
  To: libc-alpha

Am 21.11.2014 05:03, schrieb Rich Felker:
> On Thu, Nov 20, 2014 at 12:10:29PM +0100, Leonhard Holz wrote:
>> 4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l
>> tries to allocate the memory on the stack. This looks a bit weird to
>> me since if the needed memory is too large for the heap then it's
>> probably too large for the stack. Also alloca has a bad fail
>> behaviour, best case it terminates the program with a stack overflow
>> error, otherwise the "program behavior is undefined" (man page). So
>> I think one should avoid allocation attempts larger than
>> __libc_use_alloca recommends in any case.
>
> Indeed, this is a serious bug.
>

I took a look at __libc_use_alloca and it's just

extern inline int __libc_use_alloca (size_t size)
{
   return size <= __MAX_ALLOCA_CUTOFF;
}

Should'nt it consider the currently used stack size to see if there is 
enough stack memory left? Or do I have a too simple idea of OS stack 
memory handling...

>> 5. Handling of failed memory allocation. Since the last ressort in
>> the current implementation is a stack overflow there is currently no
>> "function could not compute" path. I'd like to add that in case the
>> needed buffer is too large at all (regarding size_t) or too large
>> for alloca + malloc fails. For the return value the man page states:
>> "The strxfrm() function returns the number of bytes required to
>> store the transformed string in dest excluding the terminating null
>> byte ('\0'). If the value returned is n or more, the contents of
>> dest are indeterminate." - Does that imply that returning n is a
>> good idea to communicate an error?
>
> No, the function is not permitted to return an error; it's required by
> ISO C to produce a result. Falsely reporting that it needs more space
> for the result, and thereby causing the caller to keep allocating
> larger and larger buffers until it runs out of memory itself, is not
> valid; in particular, it could report different needed lengths for the
> same string at different calls in the ame program with the same
> locale.
>
> If strcoll_l is using an algorithm that requires allocation, this
> needs to be fixed -- there's no fundamental reason it needs to
> allocate.
>

Ok. It is no big deal to add a none-allocating path but the question 
than is when to use it. We could stick to the current implementation and 
just try to malloc() if the stack is not available but personally I 
would not want strxfrm to even try to allocate memory beyond a certain 
amount. Considering that __MAX_ALLOCA_CUTOFF is actually 64KB so that 
strings up to 12.8KB could have a stack based index & rules cache one 
could maybe avoid malloc() at all without hurting most real world use cases.

So there are six possible scenarios:

1. Do not allocate a cache. This slows performance but avoids a call to 
strlen() and duplicate code (cached & uncached version).

2. Do allocate a cache on the stack only.

3. Do allocate on the stack first and otherwise malloc.

4. Do allocate on the stack first and otherwise malloc if the cache size 
is below a treshold.

5. Allocate the cache via malloc only.

6. Allocate the cache via malloc only if the size is below a treshold.

Leonhard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-21 10:23   ` Leonhard Holz
@ 2014-11-21 17:22     ` Joseph Myers
  2014-11-22 23:39     ` Ondřej Bílka
  1 sibling, 0 replies; 10+ messages in thread
From: Joseph Myers @ 2014-11-21 17:22 UTC (permalink / raw)
  To: Leonhard Holz; +Cc: libc-alpha

On Fri, 21 Nov 2014, Leonhard Holz wrote:

> I took a look at __libc_use_alloca and it's just
> 
> extern inline int __libc_use_alloca (size_t size)
> {
>   return size <= __MAX_ALLOCA_CUTOFF;
> }
> 
> Should'nt it consider the currently used stack size to see if there is enough
> stack memory left? Or do I have a too simple idea of OS stack memory
> handling...

The security requirement is simply that pages up to __MAX_ALLOCA_CUTOFF 
(plus non-alloca stack requirements) beyond the current stack pointer must 
either be available for the stack, or must be unmapped so that attempted 
writes to them result in a fatal signal - that alloca can't overflow from 
the stack into other allocated and writable memory regions.

I don't know what guarantees are provided in this regard by the Linux 
kernel / glibc's allocation of thread stacks.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-21 10:23   ` Leonhard Holz
  2014-11-21 17:22     ` Joseph Myers
@ 2014-11-22 23:39     ` Ondřej Bílka
  2014-11-24  9:26       ` Leonhard Holz
  1 sibling, 1 reply; 10+ messages in thread
From: Ondřej Bílka @ 2014-11-22 23:39 UTC (permalink / raw)
  To: Leonhard Holz; +Cc: libc-alpha

On Fri, Nov 21, 2014 at 11:23:31AM +0100, Leonhard Holz wrote:
> 
> Am 21.11.2014 05:03, schrieb Rich Felker:
> >On Thu, Nov 20, 2014 at 12:10:29PM +0100, Leonhard Holz wrote:
> >>4. Line 156: Handling of failed malloc(). If malloc fails strxfrm_l
> >>tries to allocate the memory on the stack. This looks a bit weird to
> >>me since if the needed memory is too large for the heap then it's
> >>probably too large for the stack. Also alloca has a bad fail
> >>behaviour, best case it terminates the program with a stack overflow
> >>error, otherwise the "program behavior is undefined" (man page). So
> >>I think one should avoid allocation attempts larger than
> >>__libc_use_alloca recommends in any case.
> >
> >Indeed, this is a serious bug.
> >
> 
> I took a look at __libc_use_alloca and it's just
> 
> extern inline int __libc_use_alloca (size_t size)
> {
>   return size <= __MAX_ALLOCA_CUTOFF;
> }
> 
> Should'nt it consider the currently used stack size to see if there
> is enough stack memory left? Or do I have a too simple idea of OS
> stack memory handling...
> 
> >>5. Handling of failed memory allocation. Since the last ressort in
> >>the current implementation is a stack overflow there is currently no
> >>"function could not compute" path. I'd like to add that in case the
> >>needed buffer is too large at all (regarding size_t) or too large
> >>for alloca + malloc fails. For the return value the man page states:
> >>"The strxfrm() function returns the number of bytes required to
> >>store the transformed string in dest excluding the terminating null
> >>byte ('\0'). If the value returned is n or more, the contents of
> >>dest are indeterminate." - Does that imply that returning n is a
> >>good idea to communicate an error?
> >
> >No, the function is not permitted to return an error; it's required by
> >ISO C to produce a result. Falsely reporting that it needs more space
> >for the result, and thereby causing the caller to keep allocating
> >larger and larger buffers until it runs out of memory itself, is not
> >valid; in particular, it could report different needed lengths for the
> >same string at different calls in the ame program with the same
> >locale.
> >
> >If strcoll_l is using an algorithm that requires allocation, this
> >needs to be fixed -- there's no fundamental reason it needs to
> >allocate.
> >
> 
> Ok. It is no big deal to add a none-allocating path but the question
> than is when to use it. We could stick to the current implementation
> and just try to malloc() if the stack is not available but
> personally I would not want strxfrm to even try to allocate memory
> beyond a certain amount. Considering that __MAX_ALLOCA_CUTOFF is
> actually 64KB so that strings up to 12.8KB could have a stack based
> index & rules cache one could maybe avoid malloc() at all without
> hurting most real world use cases.
> 
> So there are six possible scenarios:
> 
> 1. Do not allocate a cache. This slows performance but avoids a call
> to strlen() and duplicate code (cached & uncached version).
> 
> 2. Do allocate a cache on the stack only.
> 
> 3. Do allocate on the stack first and otherwise malloc.
> 
> 4. Do allocate on the stack first and otherwise malloc if the cache
> size is below a treshold.
> 
> 5. Allocate the cache via malloc only.
> 
> 6. Allocate the cache via malloc only if the size is below a treshold.
> 

You could also only cache last 16k characters on stack and if function
goes beyond that then recompute these / switch to uncached version.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-20 11:10 [RFC][BZ #16009] Memory handling in strxfrm_l() Leonhard Holz
  2014-11-21  4:03 ` Rich Felker
@ 2014-11-23 16:09 ` Ondřej Bílka
  1 sibling, 0 replies; 10+ messages in thread
From: Ondřej Bílka @ 2014-11-23 16:09 UTC (permalink / raw)
  To: Leonhard Holz; +Cc: libc-alpha

Some additional comments to OP.

On Thu, Nov 20, 2014 at 12:10:29PM +0100, Leonhard Holz wrote:
> Hello everybody,
> 
> by reading through the strxfrm_l function to solve a possible buffer
> overflow bug (https://sourceware.org/bugzilla/show_bug.cgi?id=16009)
> I found some other improvable things and would like to have some
> feedback how to proceed.
> 
> 1. Line 133: Empty string handling. This could be placed at the very
> start of the function. Instead of determining the strlen() of the
> input one could check if the first byte is \0 -> faster special
> path.
> 
That would not likely help as probability of transforming empty strings 
is close to zero. It could harm nonempty case as for first byte check
you need to do load+compare instruction versus only compare after
strlen.

Second point here is that speed is not primary concern as you use this
to replace many strcoll calls with strcmp.

> 
> 6. Use the given parameter n instead of srclen = strlen(src) for
> computing the buffer size. Since only n bytes are written into dst
> the number of weights and rules to cache should be limited by this
> parameter and calling strlen could be avoided. Does this sound
> reasonable?
> 
You cannot avoid strlen as you need to return size of converted string.
It is hard to compute exact length as to save space we compress weights
with utf8 encoding. However it should be possible to calculate size
without using cache.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-22 23:39     ` Ondřej Bílka
@ 2014-11-24  9:26       ` Leonhard Holz
  2014-11-24 18:40         ` Paul Eggert
  0 siblings, 1 reply; 10+ messages in thread
From: Leonhard Holz @ 2014-11-24  9:26 UTC (permalink / raw)
  To: libc-alpha

>>>
>>> No, the function is not permitted to return an error; it's required by
>>> ISO C to produce a result. Falsely reporting that it needs more space
>>> for the result, and thereby causing the caller to keep allocating
>>> larger and larger buffers until it runs out of memory itself, is not
>>> valid; in particular, it could report different needed lengths for the
>>> same string at different calls in the ame program with the same
>>> locale.
>>>
>>> If strcoll_l is using an algorithm that requires allocation, this
>>> needs to be fixed -- there's no fundamental reason it needs to
>>> allocate.
>>>
>>
>> Ok. It is no big deal to add a none-allocating path but the question
>> than is when to use it. We could stick to the current implementation
>> and just try to malloc() if the stack is not available but
>> personally I would not want strxfrm to even try to allocate memory
>> beyond a certain amount. Considering that __MAX_ALLOCA_CUTOFF is
>> actually 64KB so that strings up to 12.8KB could have a stack based
>> index & rules cache one could maybe avoid malloc() at all without
>> hurting most real world use cases.
>>
>
> You could also only cache last 16k characters on stack and if function
> goes beyond that then recompute these / switch to uncached version.
>

Thank you all for the feedback. There are two things I overlooked: 
strxfrm needs to compute the whole src string because it has to return 
the needed dest length in any case and the weight-indices-cache is 
modified while traversing the string. So it's not possible to use a 
sliding-window-approach or restrict the cache size based on dest length.

I also agree that strxfrm is a function for pre-computing things that 
need to be fast somewhere else, so performance has not the highest 
priority. Anyway, the "faster" approach is implemented so why not reuse it.

My proposal now is the following:

* allocate a fixed size cache array on the stack (e.g. 20kb supporting 
strings up to 4000 characters)
* fill it with values until either the end of the string is reached or 
the cache is full
* go with the cached version if end of string is reached
* go with the uncached version if not

This avoids strlen() + malloc() and is "fast" for standard real world 
issues like word sorting and solid for large strings.

Leonhard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-24  9:26       ` Leonhard Holz
@ 2014-11-24 18:40         ` Paul Eggert
  2014-11-25 16:21           ` Ondřej Bílka
  2014-11-25 20:20           ` Leonhard Holz
  0 siblings, 2 replies; 10+ messages in thread
From: Paul Eggert @ 2014-11-24 18:40 UTC (permalink / raw)
  To: Leonhard Holz, libc-alpha

On 11/24/2014 01:26 AM, Leonhard Holz wrote:
> strxfrm is a function for pre-computing things that need to be fast 
> somewhere else

That may be true in theory, but in practice strxfrm is pretty much 
useless everywhere.  When I tried to use it in GNU sort, I gave up in 
frustration, as strxfrm was soooo sllooooow that it was much better to 
simply use strcoll and be smart about it.  I don't know of any practical 
application that uses glibc strxfrm in a real way, and my advice for 
optimizating strxfrm is to not bother, and to focus one's efforts on 
making strcoll go faster.

Unless perhaps you're thinking of rewriting strxfrm from scratch, in 
which case we can talk about what's needed.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-24 18:40         ` Paul Eggert
@ 2014-11-25 16:21           ` Ondřej Bílka
  2014-11-25 20:20           ` Leonhard Holz
  1 sibling, 0 replies; 10+ messages in thread
From: Ondřej Bílka @ 2014-11-25 16:21 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Leonhard Holz, libc-alpha

On Mon, Nov 24, 2014 at 10:40:51AM -0800, Paul Eggert wrote:
> On 11/24/2014 01:26 AM, Leonhard Holz wrote:
> >strxfrm is a function for pre-computing things that need to be
> >fast somewhere else
> 
> That may be true in theory, but in practice strxfrm is pretty much
> useless everywhere.  When I tried to use it in GNU sort, I gave up
> in frustration, as strxfrm was soooo sllooooow that it was much
> better to simply use strcoll and be smart about it.  I don't know of
> any practical application that uses glibc strxfrm in a real way, and
> my advice for optimizating strxfrm is to not bother, and to focus
> one's efforts on making strcoll go faster.
> 
> Unless perhaps you're thinking of rewriting strxfrm from scratch, in
> which case we can talk about what's needed.

A strxfrm is flawed design as most of time one needs to know just first
ten characters to do comparison. As one cannot generate small prefix
because strxfrm needs to report size of entire string and content is
unspecificed by standard it needs to do lot of unnecessary work.

With strcoll improvement that I posted earlier a possible strcoll speedup 
is quite big, when I tested a time of 'find | sort' command sort runs
around ten times faster. However is a use case with long common prefixes
which gives that speedup, it would be less when sorting english strings.

What would help is add another function, say strtrim. That function
would delete characters ignored by comparison by string. That would fix
weakness of my approach where comparing long prefixes that differ only
in ignored characters is still slow.

As sorting goes we should add a dedicated algorithm for strcoll, one
wants to use radixsort which is possible but it needs to access locale
specification to get weigths and deal with specific cases.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC][BZ #16009] Memory handling in strxfrm_l()
  2014-11-24 18:40         ` Paul Eggert
  2014-11-25 16:21           ` Ondřej Bílka
@ 2014-11-25 20:20           ` Leonhard Holz
  1 sibling, 0 replies; 10+ messages in thread
From: Leonhard Holz @ 2014-11-25 20:20 UTC (permalink / raw)
  To: libc-alpha

Am 24.11.2014 19:40, schrieb Paul Eggert:
> On 11/24/2014 01:26 AM, Leonhard Holz wrote:
>> strxfrm is a function for pre-computing things that need to be fast
>> somewhere else
>
> That may be true in theory, but in practice strxfrm is pretty much
> useless everywhere.  When I tried to use it in GNU sort, I gave up in
> frustration, as strxfrm was soooo sllooooow that it was much better to
> simply use strcoll and be smart about it.  I don't know of any practical
> application that uses glibc strxfrm in a real way, and my advice for
> optimizating strxfrm is to not bother, and to focus one's efforts on
> making strcoll go faster.
>
> Unless perhaps you're thinking of rewriting strxfrm from scratch, in
> which case we can talk about what's needed.

Indeed it's not that easy to imagine a use case for strxfrm as storing 
the result of comparison (e.g. as an ordered database index) is probably 
always better than doubling or tripling the needed space just to make 
comparison faster. Particularly if you consider that different strings 
are likely to differ after a few bytes independent of there overall length.

Anyhow the security issue in strxfrm has to be fixed. ;)

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-11-25 20:20 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-20 11:10 [RFC][BZ #16009] Memory handling in strxfrm_l() Leonhard Holz
2014-11-21  4:03 ` Rich Felker
2014-11-21 10:23   ` Leonhard Holz
2014-11-21 17:22     ` Joseph Myers
2014-11-22 23:39     ` Ondřej Bílka
2014-11-24  9:26       ` Leonhard Holz
2014-11-24 18:40         ` Paul Eggert
2014-11-25 16:21           ` Ondřej Bílka
2014-11-25 20:20           ` Leonhard Holz
2014-11-23 16:09 ` Ondřej Bílka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).