public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jonathan Wakely <jwakely@redhat.com>
To: Cassio Neri <cassio.neri@gmail.com>
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v5] libstdc++: Remove UB from month and weekday additions and subtractions.
Date: Sun, 10 Dec 2023 13:03:16 +0000	[thread overview]
Message-ID: <CACb0b4ndWA1S9Tct71JDucuxqiXUpzNainra=P-JwSa7_raZ2g@mail.gmail.com> (raw)
In-Reply-To: <20231125135217.7180-1-cassio.neri@gmail.com>

On Sat, 25 Nov 2023 at 13:52, Cassio Neri wrote:
>
> The following invoke signed integer overflow (UB) [1]:
>
>   month   + months{MAX} // where MAX is the maximum value of months::rep
>   month   + months{MIN} // where MIN is the maximum value of months::rep
>   month   - months{MIN} // where MIN is the minimum value of months::rep
>   weekday + days  {MAX} // where MAX is the maximum value of days::rep
>   weekday - days  {MIN} // where MIN is the minimum value of days::rep
>
> For the additions to MAX, the crux of the problem is that, in libstdc++,
> months::rep and days::rep are int64_t. Other implementations use int32_t, cast
> operands to int64_t and perform arithmetic operations without risk of
> overflowing.
>
> For month + months{MIN}, the implementation follows the Standard's "returns
> clause" and evaluates:
>
>    modulo(static_cast<long long>(unsigned{__x}) + (__y.count() - 1), 12);
>
> Overflow occurs when MIN - 1 is evaluated. Casting to a larger type could help
> but, unfortunately again, this is not possible for libstdc++.
>
> For the subtraction of MIN, the problem is that -MIN is not representable.
>
> It's fair to say that the intention is for these additions/subtractions to
> be performed in modulus (12 or 7) arithmetic so that no overflow is expected.
>
> To fix these UB, this patch implements:
>
>   template <unsigned __d, typename _T>
>   unsigned __add_modulo(unsigned __x, _T __y);
>
>   template <unsigned __d, typename _T>
>   unsigned __sub_modulo(unsigned __x, _T __y);
>
> which respectively, returns the remainder of Euclidean division of, __x + __y
> and __x - __y by __d without overflowing. These functions replace
>
>   constexpr unsigned __modulo(long long __n, unsigned __d);
>
> which also calculates the reminder of __n, where __n is the result of the
> addition or subtraction. Hence, these operations might invoke UB before __modulo
> is called and thus, __modulo can't do anything to remediate the issue.
>
> In addition to solve the UB issues, __add_modulo and __sub_modulo allow better
> codegen (shorter and branchless) on x86-64 and ARM [2].
>
> [1] https://godbolt.org/z/a9YfWdn57
> [2] https://godbolt.org/z/Gh36cr7E4
>
> libstdc++-v3/ChangeLog:
>
>         * include/std/chrono: Fix + and - for months and weekdays.
>         * testsuite/std/time/month/1.cc: Add constexpr tests against overflow.
>         * testsuite/std/time/month/2.cc: New test for extreme values.
>         * testsuite/std/time/weekday/1.cc: Add constexpr tests against overflow.
>         * testsuite/std/time/weekday/2.cc: New test for extreme values.
> ---
> Good for trunk?
>
> Changes with respect to previous versions:
>   v5: Fix typos in commit message.
>   v4: Remove UB also from operator-.
>   v3: Fix screwed up email send with v2.
>   v2: Replace _T with _Tp and _U with _Up. Remove copyright+license from test.
>
>  libstdc++-v3/include/std/chrono              | 79 +++++++++++++-------
>  libstdc++-v3/testsuite/std/time/month/1.cc   | 19 +++++
>  libstdc++-v3/testsuite/std/time/month/2.cc   | 32 ++++++++
>  libstdc++-v3/testsuite/std/time/weekday/1.cc | 16 +++-
>  libstdc++-v3/testsuite/std/time/weekday/2.cc | 32 ++++++++
>  5 files changed, 151 insertions(+), 27 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/std/time/month/2.cc
>  create mode 100644 libstdc++-v3/testsuite/std/time/weekday/2.cc
>
> diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
> index e4ba6eafceb..1b7cdb96e1c 100644
> --- a/libstdc++-v3/include/std/chrono
> +++ b/libstdc++-v3/include/std/chrono
> @@ -501,18 +501,47 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>      namespace __detail
>      {
> -      // Compute the remainder of the Euclidean division of __n divided by __d.
> -      // Euclidean division truncates toward negative infinity and always
> -      // produces a remainder in the range of [0,__d-1] (whereas standard
> -      // division truncates toward zero and yields a nonpositive remainder
> -      // for negative __n).
> +      // Helper to __add_modulo and __sub_modulo.
> +      template <unsigned __d, typename _Tp>
> +      constexpr auto

I think this can be consteval. We only use it in C++20 code.

Otherwise this looks good, so I'll make that change and push to trunk. Thanks.

P.S. for some targets we *could* do the arithmetic using 128-bit
integers, but we'd still need this code for targets that don't support
that type. So I think to keep things simpler we might as well just use
this code everywhere. If using 128-bit integers would produce
significantly better codegen we can consider that later.


> +      __modulo_offset()
> +      {
> +       using _Up = make_unsigned_t<_Tp>;
> +       auto constexpr __a = _Up(-1) - _Up(255 + __d - 2);
> +       auto constexpr __b = _Up(__d * (__a / __d) - 1);
> +       // Notice: b <= a - 1 <= _Up(-1) - (255 + d - 1) and b % d = d - 1.
> +       return _Up(-1) - __b; // >= 255 + d - 1
> +      }
> +
> +      // Compute the remainder of the Euclidean division of __x + __y divided by
> +      // __d without overflowing.  Typically, __x <= 255 + d - 1 is sum of
> +      // weekday/month with a shift in [0, d - 1] and __y is a duration count.
> +      template <unsigned __d, typename _Tp>
> +      constexpr unsigned
> +      __add_modulo(unsigned __x, _Tp __y)
> +      {
> +       using _Up = make_unsigned_t<_Tp>;
> +       // For __y >= 0, _Up(__y) has the same mathematical value as __y and
> +       // this function simply returns (__x + _Up(__y)) % d.  Typically, this
> +       // doesn't overflow since the range of _Up contains many more positive
> +       // values than _Tp's.  For __y < 0, _Up(__y) has a mathematical value in
> +       // the upper-half range of _Up so that adding a positive value to it
> +       // might overflow.  Moreover, most likely, _Up(__y) != __y mod d.  To
> +       // fix both issues we subtract from _Up(__y) an __offset >=
> +       // 255 + d - 1 to make room for the addition to __x and shift the modulo
> +       // to the correct value.
> +       auto const __offset = __y >= 0 ? _Up(0) : __modulo_offset<__d, _Tp>();
> +       return (__x + _Up(__y) - __offset) % __d;
> +      }
> +
> +      // Similar to __add_modulo but for __x - __y.
> +      template <unsigned __d, typename _Tp>
>        constexpr unsigned
> -      __modulo(long long __n, unsigned __d)
> +      __sub_modulo(unsigned __x, _Tp __y)
>        {
> -       if (__n >= 0)
> -         return __n % __d;
> -       else
> -         return (__d + (__n % __d)) % __d;
> +       using _Up = make_unsigned_t<_Tp>;
> +       auto const __offset = __y <= 0 ? _Up(0) : __modulo_offset<__d, _Tp>();
> +       return (__x - _Up(__y) - __offset) % __d;
>        }
>
>        inline constexpr unsigned __days_per_month[12]
> @@ -704,8 +733,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        friend constexpr month
>        operator+(const month& __x, const months& __y) noexcept
>        {
> -       auto __n = static_cast<long long>(unsigned{__x}) + (__y.count() - 1);
> -       return month{__detail::__modulo(__n, 12) + 1};
> +       // modulo(x + (y - 1), 12) = modulo(x + (y - 1) + 12, 12)
> +       //                         = modulo((x + 11) + y    , 12)
> +       return month{1 + __detail::__add_modulo<12>(
> +         unsigned{__x} + 11, __y.count())};
>        }
>
>        friend constexpr month
> @@ -714,7 +745,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        friend constexpr month
>        operator-(const month& __x, const months& __y) noexcept
> -      { return __x + -__y; }
> +      {
> +       // modulo(x + (-y - 1), 12) = modulo(x + (-y - 1) + 12, 12)
> +       //                          = modulo((x + 11) - y     , 12)
> +       return month{1 + __detail::__sub_modulo<12>(
> +         unsigned{__x} + 11, __y.count())};
> +      }
>
>        friend constexpr months
>        operator-(const month& __x,  const month& __y) noexcept
> @@ -934,15 +970,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        static constexpr weekday
>        _S_from_days(const days& __d)
>        {
> -       using _Rep = days::rep;
> -       using _URep = make_unsigned_t<_Rep>;
> -       const auto __n = __d.count();
> -       const auto __m = static_cast<_URep>(__n);
> -
> -       // 1970-01-01 (__n =  0, __m = 0        ) -> Thursday (4)
> -       // 1969-31-12 (__n = -1, __m = _URep(-1)) -> Wednesday (3)
> -       const auto __offset = __n >= 0 ? _URep(4) : 3 - _URep(-1) % 7 - 7;
> -       return weekday((__m + __offset) % 7);
> +       return weekday{__detail::__add_modulo<7>(4, __d.count())};
>        }
>
>      public:
> @@ -1032,8 +1060,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        friend constexpr weekday
>        operator+(const weekday& __x, const days& __y) noexcept
>        {
> -       auto __n = static_cast<long long>(__x._M_wd) + __y.count();
> -       return weekday{__detail::__modulo(__n, 7)};
> +       return weekday{__detail::__add_modulo<7>(__x._M_wd, __y.count())};
>        }
>
>        friend constexpr weekday
> @@ -1042,7 +1069,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>        friend constexpr weekday
>        operator-(const weekday& __x, const days& __y) noexcept
> -      { return __x + -__y; }
> +      {
> +       return weekday{__detail::__sub_modulo<7>(__x._M_wd, __y.count())};
> +      }
>
>        friend constexpr days
>        operator-(const weekday& __x, const weekday& __y) noexcept
> diff --git a/libstdc++-v3/testsuite/std/time/month/1.cc b/libstdc++-v3/testsuite/std/time/month/1.cc
> index 773f772bf07..1b3612f27e3 100644
> --- a/libstdc++-v3/testsuite/std/time/month/1.cc
> +++ b/libstdc++-v3/testsuite/std/time/month/1.cc
> @@ -20,6 +20,7 @@
>  // Class template day [time.cal.month]
>
>  #include <chrono>
> +#include <limits>
>
>  constexpr void
>  constexpr_month()
> @@ -34,6 +35,24 @@ constexpr_month()
>    dm += months{3};
>    dm -= months{3};
>
> +  // Test for UB (overflow).
> +  {
> +    using rep = months::rep;
> +    using std::numeric_limits;
> +
> +    auto constexpr months_min = months{numeric_limits<rep>::min()};
> +    auto constexpr month_000_plus_months_min  = month{ 0 } + months_min;
> +    auto constexpr month_255_plus_months_min  = month{255} + months_min;
> +    auto constexpr month_000_minus_months_min = month{ 0 } - months_min;
> +    auto constexpr month_255_minus_months_min = month{255} - months_min;
> +
> +    auto constexpr months_max = months{numeric_limits<rep>::max()};
> +    auto constexpr month_000_plus_months_max  = month{ 0 } + months_max;
> +    auto constexpr month_255_plus_months_max  = month{255} + months_max;
> +    auto constexpr month_000_minus_months_max = month{ 0 } - months_max;
> +    auto constexpr month_255_minus_months_max = month{255} - months_max;
> +  }
> +
>    static_assert(February + months{11} == January);
>    static_assert(January + months{1200} == January);
>    static_assert(January + months{1201} == February);
> diff --git a/libstdc++-v3/testsuite/std/time/month/2.cc b/libstdc++-v3/testsuite/std/time/month/2.cc
> new file mode 100644
> index 00000000000..3bcefa60003
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/std/time/month/2.cc
> @@ -0,0 +1,32 @@
> +// { dg-do run { target c++20 } }
> +
> +// Class month [time.cal.month]
> +
> +#include <chrono>
> +#include <limits>
> +#include <testsuite_hooks.h>
> +
> +using namespace std::chrono;
> +
> +void test_extreme_values(months extreme)
> +{
> +  auto const count = extreme.count();
> +  auto const safe  = count < 0 ? count + 12 : count;
> +  auto const mod   = safe - 12 * ((safe < 0 ? safe - 11 : safe) / 12);
> +
> +  for (unsigned m = 0; m < 256; ++m)
> +  {
> +    auto const month_plus_extreme = month{m} + extreme;
> +    VERIFY(unsigned{month_plus_extreme } == (m + 11 + mod) % 12 + 1);
> +
> +    auto const month_minus_extreme = month{m} - extreme;
> +    VERIFY(unsigned{month_minus_extreme} == (m + 11 - mod) % 12 + 1);
> +  }
> +}
> +
> +int main()
> +{
> +  test_extreme_values(months{std::numeric_limits<months::rep>::max()});
> +  test_extreme_values(months{std::numeric_limits<months::rep>::min()});
> +  return 0;
> +}
> diff --git a/libstdc++-v3/testsuite/std/time/weekday/1.cc b/libstdc++-v3/testsuite/std/time/weekday/1.cc
> index e89fca47d4b..2f0249b28a4 100644
> --- a/libstdc++-v3/testsuite/std/time/weekday/1.cc
> +++ b/libstdc++-v3/testsuite/std/time/weekday/1.cc
> @@ -42,8 +42,20 @@ constexpr_weekday()
>    {
>      using rep = days::rep;
>      using std::numeric_limits;
> -    constexpr weekday max{sys_days{days{numeric_limits<rep>::max()}}};
> -    constexpr weekday min{sys_days{days{numeric_limits<rep>::min()}}};
> +
> +    auto constexpr days_min = days{numeric_limits<rep>::min()};
> +    auto constexpr weekday_from_sysdays_min   = weekday{sys_days{days_min}};
> +    auto constexpr weekday_000_plus_days_min  = weekday{ 0 } + days_min;
> +    auto constexpr weekday_255_plus_days_min  = weekday{255} + days_min;
> +    auto constexpr weekday_000_minus_days_min = weekday{ 0 } - days_min;
> +    auto constexpr weekday_255_minus_days_min = weekday{255} - days_min;
> +
> +    auto constexpr days_max = days{numeric_limits<rep>::max()};
> +    auto constexpr weekday_from_sysdays_max   = weekday{sys_days{days_max}};
> +    auto constexpr weekday_000_plus_days_max  = weekday{ 0 } + days_max;
> +    auto constexpr weekday_255_plus_days_max  = weekday{255} + days_max;
> +    auto constexpr weekday_000_minus_days_max = weekday{ 0 } - days_max;
> +    auto constexpr weekday_255_minus_days_max = weekday{255} - days_max;
>    }
>
>    static_assert(weekday{sys_days{1900y/January/1}} == Monday);
> diff --git a/libstdc++-v3/testsuite/std/time/weekday/2.cc b/libstdc++-v3/testsuite/std/time/weekday/2.cc
> new file mode 100644
> index 00000000000..924709321e5
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/std/time/weekday/2.cc
> @@ -0,0 +1,32 @@
> +// { dg-do run { target c++20 } }
> +
> +// Class weekday [time.cal.wd]
> +
> +#include <chrono>
> +#include <limits>
> +#include <testsuite_hooks.h>
> +
> +using namespace std::chrono;
> +
> +void test_extreme_values(days extreme)
> +{
> +  auto const count = extreme.count();
> +  auto const safe  = count < 0 ? count + 7 : count;
> +  auto const mod   = safe - 7 * ((safe < 0 ? safe - 6 : safe) / 7);
> +
> +  for (unsigned d = 0; d < 254; ++d)
> +  {
> +    auto const weekday_plus_extreme = weekday{d} + extreme;
> +    VERIFY(weekday_plus_extreme.c_encoding()  == (d + mod) % 7);
> +
> +    auto const weekday_minus_extreme = weekday{d} - extreme;
> +    VERIFY(weekday_minus_extreme.c_encoding() == (d + 7 - mod) % 7);
> +  }
> +}
> +
> +int main()
> +{
> +  test_extreme_values(days{std::numeric_limits<days::rep>::max()});
> +  test_extreme_values(days{std::numeric_limits<days::rep>::min()});
> +  return 0;
> +}
> --
> 2.41.0
>


      parent reply	other threads:[~2023-12-10 13:03 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-17  1:34 [PATCH] libstdc++: Remove UB from operator+ of months and weekdays Cassio Neri
2023-11-17 10:15 ` Jonathan Wakely
2023-11-18  0:06   ` [PATCH v2] The following functions invoke signed integer overflow (UB) for some extreme values of days and months [1]: Cassio Neri
2023-11-18  0:19     ` [PATCH v3] libstdc++: Remove UB from operator+ of months and weekdays Cassio Neri
2023-11-18 11:07       ` Cassio Neri
2023-11-21  2:06       ` [PATCH v4] libstdc++: Remove UB from month and weekday additions and subtractions Cassio Neri
2023-11-25 13:52         ` [PATCH v5] " Cassio Neri
2023-12-01 18:00           ` ping: " Cassio Neri
2023-12-10 11:31           ` [PING,PATCH " Cassio Neri
2023-12-10 13:03           ` Jonathan Wakely [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACb0b4ndWA1S9Tct71JDucuxqiXUpzNainra=P-JwSa7_raZ2g@mail.gmail.com' \
    --to=jwakely@redhat.com \
    --cc=cassio.neri@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).