public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* intl: plural form evaluation crashes — worth fixing?
@ 2023-10-06 15:19 Bruno Haible
  2023-10-06 16:44 ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 4+ messages in thread
From: Bruno Haible @ 2023-10-06 15:19 UTC (permalink / raw)
  To: libc-alpha

Hi,

There are two ways that a malicious PO file, after compilation to a .mo file
through 'msgfmt -c', can lead to a crash in the program that uses it. The
crash occurs in glibc code, not in application code.
(Note that 'msgfmt -c' makes all sorts of checks on format strings, to avoid
that application code uses *printf() functions with wrongly typed arguments.
But here the crash is in glibc; no format strings are involved.)

I have fixed both issues in GNU gettext's libintl. Should I also submit them
for fix in glibc, here?

Case 1: Crash by stack overflow.
The test case is in [1]. It involves a 'plural' form expression with many
parentheses. It is only observable when the stack size has been reduced
from the default (8 MB) to sizes <= 260 KB. As far as I understand, stack
sizes in the range of 100 KB to 250 KB are sometimes used, especially for
programs that create many threads.
The fix is in [2]. It avoids the stack overflow by tracking the recursion
depth. This costs a little bit of speed. But since a plural form expression
is typically small, the slowdown will usually be just a few dozen instructions.

Case 2: Crash by SIGFPE.
The test case is in [3]. It involves a 'plural' form expression that
triggers an integer division by zero in a range that 'msgfmt -c' has not
checked (since it cannot check infinitely many values).
The fix is in [4]. It avoids the integer division by zero through a runtime
check. This costs a little bit of speed, but typically just 1 or 2
instructions per division.

I'm asking because it's a trade-off between reliability and speed (with
major impact on reliability, in the presence of a malicious actor, but only
a tiny impact on speed).

Please CC me in your replies, as I am not subscribed to the list.

Bruno

[1] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=021348871a22041cd9d2625ab958dd4808fd282d
[2] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=ef37a15408b285976978fa0d1e105ef97e23fa59
[3] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=429ba6c6b835756ceb83f4396580ad5abe36abcf
[4] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=726bfb1d1c5a31b75250c3c9ccc8894195332d63




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: intl: plural form evaluation crashes — worth fixing?
  2023-10-06 15:19 intl: plural form evaluation crashes — worth fixing? Bruno Haible
@ 2023-10-06 16:44 ` Adhemerval Zanella Netto
  2023-10-06 17:51   ` Bruno Haible
  0 siblings, 1 reply; 4+ messages in thread
From: Adhemerval Zanella Netto @ 2023-10-06 16:44 UTC (permalink / raw)
  To: Bruno Haible, libc-alpha



On 06/10/23 12:19, Bruno Haible wrote:
> Hi,
> 
> There are two ways that a malicious PO file, after compilation to a .mo file
> through 'msgfmt -c', can lead to a crash in the program that uses it. The
> crash occurs in glibc code, not in application code.
> (Note that 'msgfmt -c' makes all sorts of checks on format strings, to avoid
> that application code uses *printf() functions with wrongly typed arguments.
> But here the crash is in glibc; no format strings are involved.)
> 
> I have fixed both issues in GNU gettext's libintl. Should I also submit them
> for fix in glibc, here?
> 
> Case 1: Crash by stack overflow.
> The test case is in [1]. It involves a 'plural' form expression with many
> parentheses. It is only observable when the stack size has been reduced
> from the default (8 MB) to sizes <= 260 KB. As far as I understand, stack
> sizes in the range of 100 KB to 250 KB are sometimes used, especially for
> programs that create many threads.
> The fix is in [2]. It avoids the stack overflow by tracking the recursion
> depth. This costs a little bit of speed. But since a plural form expression
> is typically small, the slowdown will usually be just a few dozen instructions.
> 
> Case 2: Crash by SIGFPE.
> The test case is in [3]. It involves a 'plural' form expression that
> triggers an integer division by zero in a range that 'msgfmt -c' has not
> checked (since it cannot check infinitely many values).
> The fix is in [4]. It avoids the integer division by zero through a runtime
> check. This costs a little bit of speed, but typically just 1 or 2
> instructions per division.
> 
> I'm asking because it's a trade-off between reliability and speed (with
> major impact on reliability, in the presence of a malicious actor, but only
> a tiny impact on speed).
> 
> Please CC me in your replies, as I am not subscribed to the list.
> 
> Bruno
> 
> [1] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=021348871a22041cd9d2625ab958dd4808fd282d
> [2] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=ef37a15408b285976978fa0d1e105ef97e23fa59
> [3] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=429ba6c6b835756ceb83f4396580ad5abe36abcf
> [4] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=commitdiff;h=726bfb1d1c5a31b75250c3c9ccc8894195332d63

Yes, I think both issues are worth fixing.  In fact, building dcigettext.c
with -fstack-usage show that plural_eval uses about 48 bytes and with
EVAL_MAXDEPTH being 100 it means about worst case of ~4.8MB which is
still quite large.  Do we really EVAL_MAXDEPTH set to 100? Can we limit
even more to something more sane, like about 10/20?

For 2, issue raise by a library if far from ideal and it would be good
that we can remove it.

If you can create regression tests it would be really helpful as well.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: intl: plural form evaluation crashes — worth fixing?
  2023-10-06 16:44 ` Adhemerval Zanella Netto
@ 2023-10-06 17:51   ` Bruno Haible
  2023-10-06 18:00     ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 4+ messages in thread
From: Bruno Haible @ 2023-10-06 17:51 UTC (permalink / raw)
  To: libc-alpha, Adhemerval Zanella Netto

Adhemerval Zanella Netto wrote:
> Yes, I think both issues are worth fixing.

OK, I'll then enter them in bugzilla and prepare patches.
Can you please give me edit access (not only creation and comment access) to
bugzilla, per https://sourceware.org/glibc/wiki/Bugzilla%20Procedures § B0 ?

> In fact, building dcigettext.c
> with -fstack-usage show that plural_eval uses about 48 bytes and with
> EVAL_MAXDEPTH being 100 it means about worst case of ~4.8MB which is
> still quite large.

No no. 48 bytes/frame * 100 frames = 4.8 KB, not 4.8 MB. Even threads on
very small systems have 16 KB at least.

> Do we really EVAL_MAXDEPTH set to 100?

musl libc, which is used in "smaller" systems than glibc, has set it to 100
too.

Bruno




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: intl: plural form evaluation crashes — worth fixing?
  2023-10-06 17:51   ` Bruno Haible
@ 2023-10-06 18:00     ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 4+ messages in thread
From: Adhemerval Zanella Netto @ 2023-10-06 18:00 UTC (permalink / raw)
  To: Bruno Haible, libc-alpha



On 06/10/23 14:51, Bruno Haible wrote:
> Adhemerval Zanella Netto wrote:
>> Yes, I think both issues are worth fixing.
> 
> OK, I'll then enter them in bugzilla and prepare patches.
> Can you please give me edit access (not only creation and comment access) to
> bugzilla, per https://sourceware.org/glibc/wiki/Bugzilla%20Procedures § B0 ?
> 
>> In fact, building dcigettext.c
>> with -fstack-usage show that plural_eval uses about 48 bytes and with
>> EVAL_MAXDEPTH being 100 it means about worst case of ~4.8MB which is
>> still quite large.
> 
> No no. 48 bytes/frame * 100 frames = 4.8 KB, not 4.8 MB. Even threads on
> very small systems have 16 KB at least.

Oops, yeah I mean KB indeed.  But it is still somewhat large for a *single*
function call (recall that we don't when in the stackframe the thread will 
ending calling the libc).

> 
>> Do we really EVAL_MAXDEPTH set to 100?
> 
> musl libc, which is used in "smaller" systems than glibc, has set it to 100
> too.

Fair enough.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-10-06 18:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-06 15:19 intl: plural form evaluation crashes — worth fixing? Bruno Haible
2023-10-06 16:44 ` Adhemerval Zanella Netto
2023-10-06 17:51   ` Bruno Haible
2023-10-06 18:00     ` Adhemerval Zanella Netto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).