From: Iain Buclaw <ibuclaw@gdcproject.org>
To: Alan Modra <amodra@gmail.com>, gcc-patches@gcc.gnu.org
Cc: Ian Lance Taylor <ian@airs.com>
Subject: Re: ubsan: d-demangle.c:214 signed integer overflow
Date: Thu, 03 Sep 2020 23:02:50 +0200 [thread overview]
Message-ID: <1599163400.8r2ly1k30n.astroid@galago.none> (raw)
In-Reply-To: <20200903130116.GQ15695@bubble.grove.modra.org>
Excerpts from Alan Modra's message of September 3, 2020 3:01 pm:
> Running the libiberty testsuite
> ./test-demangle < libiberty/testsuite/d-demangle-expected
> libiberty/d-demangle.c:214:14: runtime error: signed integer overflow: 922337203 * 10 cannot be represented in type 'long int'
>
> On looking at silencing ubsan, I found a real bug in dlang_number.
> For a 32-bit long, some overflows won't be detected. For example,
> 21474836480. Why? Well 214748364 * 10 is 0x7FFFFFF8 (no overflow so
> far). Adding 8 gives 0x80000000 (which does overflow but there is no
> test for that overflow in the code). Then multiplying 0x80000000 * 10
> = 0x500000000 = 0 won't be caught by the multiplication overflow test.
> The same holds for a 64-bit long using similarly crafted digit
> sequences.
>
> This patch replaces the mod 10 test with a simpler limit test, and
> similarly the mod 26 test in dlang_decode_backref.
>
> About the limit test:
> val * 10 + digit > ULONG_MAX is the condition for overflow
> ie.
> val * 10 > ULONG_MAX - digit
> or
> val > (ULONG_MAX - digit) / 10
> or assuming the largest digit
> val > (ULONG_MAX - 9) / 10
>
> I resisted the aesthetic appeal of simplifying this further to
> val > -10UL / 10
> since -1UL for ULONG_MAX is only correct for 2's complement numbers.
>
> Passes all the libiberty tests, on both 32-bit and 64-bit hosts. OK
> to apply?
>
Thanks Alan, change seems reasonable, however on giving it a mull over,
I see that the largest number that dlang_number would need to be able to
handle is UINT_MAX. These two tests which decode a wchar value are
representative of that (first is valid, second invalid).
#
--format=dlang
_D4test21__T3funVwi4294967295Z3funFNaNbNiNfZv
test.fun!('\Uffffffff').fun()
#
--format=dlang
_D4test21__T3funVwi4294967296Z3funFNaNbNiNfZv
_D4test21__T3funVwi4294967296Z3funFNaNbNiNfZv
I'm fine with creating a new PR and dealing with the above in a separate
change though, as it will require a few more replacements to adjust the
result parameter type to 'unsigned' or 'long long'.
Iain.
> * d-demangle.c: Include limits.h.
> (ULONG_MAX): Provide fall-back definition.
> (dlang_number): Simplify and correct overflow test. Only
> write *ret on returning non-NULL.
> (dlang_decode_backref): Likewise.
>
> diff --git a/libiberty/d-demangle.c b/libiberty/d-demangle.c
> index f2d6946eca..59e6ae007a 100644
> --- a/libiberty/d-demangle.c
> +++ b/libiberty/d-demangle.c
> @@ -31,6 +31,9 @@ If not, see <http://www.gnu.org/licenses/>. */
> #ifdef HAVE_CONFIG_H
> #include "config.h"
> #endif
> +#ifdef HAVE_LIMITS_H
> +#include <limits.h>
> +#endif
>
> #include "safe-ctype.h"
>
> @@ -45,6 +48,10 @@ If not, see <http://www.gnu.org/licenses/>. */
> #include <demangle.h>
> #include "libiberty.h"
>
> +#ifndef ULONG_MAX
> +#define ULONG_MAX (~0UL)
> +#endif
> +
> /* A mini string-handling package */
>
> typedef struct string /* Beware: these aren't required to be */
> @@ -207,24 +214,24 @@ dlang_number (const char *mangled, long *ret)
> if (mangled == NULL || !ISDIGIT (*mangled))
> return NULL;
>
> - (*ret) = 0;
> + unsigned long val = 0;
>
> while (ISDIGIT (*mangled))
> {
> - (*ret) *= 10;
> -
> - /* If an overflow occured when multiplying by ten, the result
> - will not be a multiple of ten. */
> - if ((*ret % 10) != 0)
> + /* Check for overflow. Yes, we return NULL here for some digits
> + that don't overflow "val * 10 + digit", but that doesn't
> + matter given the later "(long) val < 0" test. */
> + if (val > (ULONG_MAX - 9) / 10)
> return NULL;
>
> - (*ret) += mangled[0] - '0';
> + val = val * 10 + mangled[0] - '0';
> mangled++;
> }
>
> - if (*mangled == '\0' || *ret < 0)
> + if (*mangled == '\0' || (long) val < 0)
> return NULL;
>
> + *ret = val;
> return mangled;
> }
>
> @@ -294,24 +301,24 @@ dlang_decode_backref (const char *mangled, long *ret)
> [A-Z] NumberBackRef
> ^
> */
> - (*ret) = 0;
> + unsigned long val = 0;
>
> while (ISALPHA (*mangled))
> {
> - (*ret) *= 26;
> + /* Check for overflow. */
> + if (val > (ULONG_MAX - 25) / 26)
> + break;
>
> - /* If an overflow occured when multiplying by 26, the result
> - will not be a multiple of 26. */
> - if ((*ret % 26) != 0)
> - return NULL;
> + val *= 26;
>
> if (mangled[0] >= 'a' && mangled[0] <= 'z')
> {
> - (*ret) += mangled[0] - 'a';
> + val += mangled[0] - 'a';
> + *ret = val;
> return mangled + 1;
> }
>
> - (*ret) += mangled[0] - 'A';
> + val += mangled[0] - 'A';
> mangled++;
> }
>
> --
> Alan Modra
> Australia Development Lab, IBM
>
next prev parent reply other threads:[~2020-09-03 21:02 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-03 13:01 Alan Modra
2020-09-03 21:02 ` Iain Buclaw [this message]
2020-09-04 0:59 ` Alan Modra
2020-09-04 11:22 ` Iain Buclaw
2020-09-04 13:34 ` Alan Modra
2020-09-04 16:23 ` Iain Buclaw
2020-09-07 0:56 ` Alan Modra
2020-09-07 16:17 ` Iain Buclaw
2020-09-07 17:46 ` Ian Lance Taylor
2020-11-13 19:04 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1599163400.8r2ly1k30n.astroid@galago.none \
--to=ibuclaw@gdcproject.org \
--cc=amodra@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=ian@airs.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).