Re: ubsan: d-demangle.c:214 signed integer overflow

From: Iain Buclaw <ibuclaw@gdcproject.org>
To: Alan Modra <amodra@gmail.com>, gcc-patches@gcc.gnu.org
Cc: Ian Lance Taylor <ian@airs.com>
Subject: Re: ubsan: d-demangle.c:214 signed integer overflow
Date: Thu, 03 Sep 2020 23:02:50 +0200
Message-ID: <1599163400.8r2ly1k30n.astroid@galago.none>
In-Reply-To: <20200903130116.GQ15695@bubble.grove.modra.org>

Excerpts from Alan Modra's message of September 3, 2020 3:01 pm:
> Running the libiberty testsuite
> ./test-demangle < libiberty/testsuite/d-demangle-expected
> libiberty/d-demangle.c:214:14: runtime error: signed integer overflow: 922337203 * 10 cannot be represented in type 'long int'
> 
> On looking at silencing ubsan, I found a real bug in dlang_number.
> For a 32-bit long, some overflows won't be detected.  For example,
> 21474836480.  Why?  Well 214748364 * 10 is 0x7FFFFFF8 (no overflow so
> far).  Adding 8 gives 0x80000000 (which does overflow but there is no
> test for that overflow in the code).  Then multiplying 0x80000000 * 10
> = 0x500000000 = 0 won't be caught by the multiplication overflow test.
> The same holds for a 64-bit long using similarly crafted digit
> sequences.
> 
> This patch replaces the mod 10 test with a simpler limit test, and
> similarly the mod 26 test in dlang_decode_backref.
> 
> About the limit test:
>   val * 10 + digit > ULONG_MAX is the condition for overflow
> i.e.
>   val * 10 > ULONG_MAX - digit
> or
>   val > (ULONG_MAX - digit) / 10
> or assuming the largest digit
>   val > (ULONG_MAX - 9) / 10
> 
> I resisted the aesthetic appeal of simplifying this further to
>   val > -10UL / 10
> since -1UL for ULONG_MAX is only correct for 2's complement numbers.
> 
> Passes all the libiberty tests, on both 32-bit and 64-bit hosts.  OK
> to apply?
> 
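
To make the failure mode described above concrete, here is a minimal
sketch that simulates a 32-bit long with fixed-width types, so it behaves
the same on 32-bit and 64-bit hosts.  The function names are hypothetical
and not taken from d-demangle.c; the first models the old "multiple of
ten" test, the second models the limit test from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Interpret 32 bits as a two's complement signed value, without relying
   on implementation-defined conversions.  */
static int64_t
as_signed32 (uint32_t bits)
{
  return bits < 0x80000000u ? (int64_t) bits : (int64_t) bits - 4294967296LL;
}

/* Model of the old test: multiply first, then treat a result that is not
   a multiple of ten as overflow.  The addition is never checked.
   Returns 1 if the digit string is accepted, 0 if it is rejected.  */
static int
old_check_accepts (const char *s, int64_t *out)
{
  uint32_t bits = 0;	/* unsigned, so wrap-around is well defined */

  assert (s != NULL && out != NULL);
  for (; *s >= '0' && *s <= '9'; s++)
    {
      bits *= 10u;
      if (as_signed32 (bits) % 10 != 0)
	return 0;
      bits += (uint32_t) (*s - '0');
    }

  if (as_signed32 (bits) < 0)	/* models the "*ret < 0" test */
    return 0;
  *out = as_signed32 (bits);
  return 1;
}

/* Model of the patched test: refuse to continue once another digit
   could push the value past the unsigned maximum.  */
static int
new_check_accepts (const char *s, int64_t *out)
{
  uint32_t val = 0;

  assert (s != NULL && out != NULL);
  for (; *s >= '0' && *s <= '9'; s++)
    {
      if (val > (UINT32_MAX - 9) / 10)
	return 0;
      val = val * 10u + (uint32_t) (*s - '0');
    }

  if (as_signed32 (val) < 0)	/* models the "(long) val < 0" test */
    return 0;
  *out = as_signed32 (val);
  return 1;
}
```

With these models, old_check_accepts ("21474836480", &v) accepts the
input and stores 0, while new_check_accepts rejects the same string and
still accepts "2147483647".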

Thanks Alan, the change seems reasonable.  However, on mulling it over,
I see that the largest number dlang_number needs to be able to handle
is UINT_MAX.  These two test cases, which decode a wchar value, are
representative of that (the first is valid, the second invalid).

#
--format=dlang
_D4test21__T3funVwi4294967295Z3funFNaNbNiNfZv
test.fun!('\Uffffffff').fun()
#
--format=dlang
_D4test21__T3funVwi4294967296Z3funFNaNbNiNfZv
_D4test21__T3funVwi4294967296Z3funFNaNbNiNfZv

I'm fine with creating a new PR and dealing with the above in a separate
change, though, as it will require a few more replacements to adjust the
result parameter type to 'unsigned' or 'long long'.
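
As a rough sketch of what a UINT_MAX-bounded parse could look like (an
illustration only, not the follow-up change itself): parse_uint_bounded
is a hypothetical helper, not an existing libiberty function, and unlike
dlang_number it does not reject input that ends right after the digits.
It uses the exact per-digit form of the limit test, so UINT_MAX itself
is accepted and UINT_MAX + 1 is rejected.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: parse the leading decimal digits of S into *RET.
   Returns a pointer just past the digits, or NULL if S does not start
   with a digit or the value exceeds UINT_MAX.  *RET is only written on
   success.  */
static const char *
parse_uint_bounded (const char *s, unsigned int *ret)
{
  unsigned int val = 0;

  assert (ret != NULL);
  if (s == NULL || *s < '0' || *s > '9')
    return NULL;

  while (*s >= '0' && *s <= '9')
    {
      unsigned int digit = (unsigned int) (*s - '0');

      /* "val * 10 + digit > UINT_MAX", rearranged so that no
	 intermediate result can wrap.  */
      if (val > (UINT_MAX - digit) / 10)
	return NULL;

      val = val * 10 + digit;
      s++;
    }

  *ret = val;
  return s;
}
```

Assuming a 32-bit unsigned int, the digits in the first test name above
("4294967295Z...") parse successfully, and those in the second
("4294967296Z...") return NULL.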

Iain.


> 	* d-demangle.c: Include limits.h.
> 	(ULONG_MAX): Provide fall-back definition.
> 	(dlang_number): Simplify and correct overflow test.  Only
> 	write *ret on returning non-NULL.
> 	(dlang_decode_backref): Likewise.
> 
> diff --git a/libiberty/d-demangle.c b/libiberty/d-demangle.c
> index f2d6946eca..59e6ae007a 100644
> --- a/libiberty/d-demangle.c
> +++ b/libiberty/d-demangle.c
> @@ -31,6 +31,9 @@ If not, see <http://www.gnu.org/licenses/>.  */
>  #ifdef HAVE_CONFIG_H
>  #include "config.h"
>  #endif
> +#ifdef HAVE_LIMITS_H
> +#include <limits.h>
> +#endif
>  
>  #include "safe-ctype.h"
>  
> @@ -45,6 +48,10 @@ If not, see <http://www.gnu.org/licenses/>.  */
>  #include <demangle.h>
>  #include "libiberty.h"
>  
> +#ifndef ULONG_MAX
> +#define	ULONG_MAX	(~0UL)
> +#endif
> +
>  /* A mini string-handling package */
>  
>  typedef struct string		/* Beware: these aren't required to be */
> @@ -207,24 +214,24 @@ dlang_number (const char *mangled, long *ret)
>    if (mangled == NULL || !ISDIGIT (*mangled))
>      return NULL;
>  
> -  (*ret) = 0;
> +  unsigned long val = 0;
>  
>    while (ISDIGIT (*mangled))
>      {
> -      (*ret) *= 10;
> -
> -      /* If an overflow occured when multiplying by ten, the result
> -	 will not be a multiple of ten.  */
> -      if ((*ret % 10) != 0)
> +      /* Check for overflow.  Yes, we return NULL here for some digits
> +	 that don't overflow "val * 10 + digit", but that doesn't
> +	 matter given the later "(long) val < 0" test.  */
> +      if (val > (ULONG_MAX - 9) / 10)
>  	return NULL;
>  
> -      (*ret) += mangled[0] - '0';
> +      val = val * 10 + mangled[0] - '0';
>        mangled++;
>      }
>  
> -  if (*mangled == '\0' || *ret < 0)
> +  if (*mangled == '\0' || (long) val < 0)
>      return NULL;
>  
> +  *ret = val;
>    return mangled;
>  }
>  
> @@ -294,24 +301,24 @@ dlang_decode_backref (const char *mangled, long *ret)
>  	    [A-Z] NumberBackRef
>  	    ^
>     */
> -  (*ret) = 0;
> +  unsigned long val = 0;
>  
>    while (ISALPHA (*mangled))
>      {
> -      (*ret) *= 26;
> +      /* Check for overflow.  */
> +      if (val > (ULONG_MAX - 25) / 26)
> +	break;
>  
> -      /* If an overflow occured when multiplying by 26, the result
> -	 will not be a multiple of 26.  */
> -      if ((*ret % 26) != 0)
> -	return NULL;
> +      val *= 26;
>  
>        if (mangled[0] >= 'a' && mangled[0] <= 'z')
>  	{
> -	  (*ret) += mangled[0] - 'a';
> +	  val += mangled[0] - 'a';
> +	  *ret = val;
>  	  return mangled + 1;
>  	}
>  
> -      (*ret) += mangled[0] - 'A';
> +      val += mangled[0] - 'A';
>        mangled++;
>      }
>  
> -- 
> Alan Modra
> Australia Development Lab, IBM
>
