From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id B21313858434; Thu, 12 Oct 2023 14:07:10 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B21313858434
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1697119630;
	bh=ekXEJZT2mBkDeux9GxpScihNAbeLqlmuOOpuQ2uHX30=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=Ohk9XpdHe++r2n+tSr4RCV2q4yCmkb9CkmXLRry8sgP95UHo4IXFV9OpxV5ZN/NAu
	 zRVvPk0KBIvEprRS1awMnLrYdEyZr40Lqnxz0Ev56HnDsD1Vzc0Ll6DlQEoaLeNlLK
	 bH5KftPG5WKI2kmWzSaA9TvyLoP0f1RZBVQ2dEIc=
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/102989] Implement C2x's n2763 (_BitInt)
Date: Thu, 12 Oct 2023 14:07:07 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: jakub at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-102989-4-9nHe0wBFzL@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102989-4@http.gcc.gnu.org/bugzilla/>
References: <bug-102989-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
--- Comment #112 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:0d00385eaf72ccacff17935b0d214a26773e095f

commit r14-4592-g0d00385eaf72ccacff17935b0d214a26773e095f
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Oct 12 16:01:12 2023 +0200

    wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

    As mentioned in the _BitInt support thread, _BitInt(N) is currently limited
    by the wide_int/widest_int maximum precision, which, depending on the
    target, is 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION).
    That is a fairly low limit for _BitInt, especially on the targets with the
    191 bit limitation.

    The following patch bumps that limit to 16319 bits on all arches that
    support _BitInt at all, which is the limit imposed by the INTEGER_CST
    representation (unsigned char members holding the number of HOST_WIDE_INT
    limbs).
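
    The arithmetic behind these limits can be checked with a small sketch.
    It assumes a 64-bit HOST_WIDE_INT; the names host_bits_per_wide_int,
    max_limbs, precision_for_limbs and bitint_limit are hypothetical helpers
    for illustration, not GCC identifiers.

```cpp
#include <cassert>
#include <climits>

// Assumption: HOST_WIDE_INT is 64 bits wide on the host.
constexpr unsigned host_bits_per_wide_int = 64;

// The limb count is stored in an unsigned char, so at most 255 limbs.
constexpr unsigned max_limbs = UCHAR_MAX;

// Precision in bits that a given number of limbs can hold.
constexpr unsigned precision_for_limbs (unsigned limbs)
{
  return limbs * host_bits_per_wide_int;
}

// Usable _BitInt limit: one bit less than the maximum precision.
constexpr unsigned bitint_limit (unsigned limbs)
{
  return precision_for_limbs (limbs) - 1;
}

// New limits: 255 limbs give 16320 bits, so _BitInt goes up to 16319.
static_assert (precision_for_limbs (max_limbs) == 16320, "wide_int maximum");
static_assert (bitint_limit (max_limbs) == 16319, "new _BitInt limit");

// Old limits for the smallest and largest former WIDE_INT_MAX_ELTS.
static_assert (bitint_limit (3) == 191, "old limit with 3 limbs");
static_assert (bitint_limit (11) == 703, "old limit with 11 limbs");

// widest_int gets twice the INTEGER_CST limit.
static_assert (2 * precision_for_limbs (max_limbs) == 32640, "widest_int");
```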

    In order to achieve that, wide_int is changed from a trivially copyable type
    which contained just an inline array of WIDE_INT_MAX_ELTS limbs (3, 5, 9 or
    11 depending on target) into a non-trivially copy constructible, copy
    assignable and destructible type which for the usual small cases (up
    to WIDE_INT_MAX_INL_ELTS, which is the former WIDE_INT_MAX_ELTS) still uses
    an inline array of limbs, but for larger precisions uses a heap allocated
    limb array.  This makes wide_int unusable in GC structures, so for
    dwarf2out, which was the only place that needed it, there is a new
    rwide_int type (restricted wide_int) which supports only up to
    RWIDE_INT_MAX_ELTS limbs inline and is trivially copyable (dwarf2out should
    never deal with large _BitInt constants, those should have been lowered
    earlier).
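
    The inline-or-heap layout described above can be sketched roughly as
    follows.  This is an illustration only, not the actual wide_int_storage:
    limb_storage, inl_elts and bits_per_limb are hypothetical names, and
    inl_elts is fixed at 3 purely for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins for WIDE_INT_MAX_INL_ELTS and the limb width.
constexpr unsigned inl_elts = 3;
constexpr unsigned bits_per_limb = 64;

// Illustrative storage: inline limbs for small precisions, heap otherwise.
class limb_storage
{
  union { int64_t val[inl_elts]; int64_t *valp; } u;
  unsigned precision;

  static unsigned limbs_for (unsigned prec)
  { return (prec + bits_per_limb - 1) / bits_per_limb; }

  // Set the precision and zero-initialize inline or heap limbs.
  void init (unsigned prec)
  {
    precision = prec;
    if (on_heap ())
      u.valp = new int64_t[limbs_for (prec)] ();
    else
      std::memset (u.val, 0, sizeof u.val);
  }

  void release () { if (on_heap ()) delete[] u.valp; }

  void copy_from (const limb_storage &o)
  { std::memcpy (limbs (), o.limbs (), num_limbs () * sizeof (int64_t)); }

public:
  explicit limb_storage (unsigned prec) { init (prec); }
  limb_storage (const limb_storage &o) { init (o.precision); copy_from (o); }
  limb_storage &operator= (const limb_storage &o)
  {
    if (this != &o)
      {
        // Reallocate only when the precision changes.
        if (precision != o.precision) { release (); init (o.precision); }
        copy_from (o);
      }
    return *this;
  }
  ~limb_storage () { release (); }

  bool on_heap () const { return precision > inl_elts * bits_per_limb; }
  unsigned num_limbs () const { return limbs_for (precision); }
  int64_t *limbs () { return on_heap () ? u.valp : u.val; }
  const int64_t *limbs () const { return on_heap () ? u.valp : u.val; }
};

// The user-provided copy operations and destructor are what rule the
// type out of contexts that require trivially copyable objects.
static_assert (!std::is_trivially_copyable<limb_storage>::value,
               "no longer trivially copyable");
```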

    Similarly, widest_int has been changed from a trivially copyable type which
    also contained an inline array of WIDE_INT_MAX_ELTS limbs (but unlike
    wide_int didn't contain precision and assumed that to be
    WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy
    assignable and destructible type which always has WIDEST_INT_MAX_PRECISION
    precision (32640 bits currently, twice as much as the INTEGER_CST limitation
    allows) and unlike wide_int decides depending on the get_len () value
    whether it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS) or a
    heap allocated one.  In wide-int.h this means we need to estimate an upper
    bound on how many limbs wide-int.cc (usually, sometimes wide-int.h) will
    need to write and heap allocate if needed based on that estimate.  Then,
    upon set_len, which is done at the end, if we guessed over
    WIDE_INT_MAX_INL_ELTS and allocated dynamically while we actually need
    less than that, we copy the limbs and deallocate.  The inexact guesses are
    needed because the exact computation of the length in wide-int.cc is
    sometimes quite complex and especially canonicalization at the end can
    decrease it.  Because of this, widest_int is again not usable in GC
    structures, so cfgloop.h has been changed to use
    fixed_wide_int_storage <WIDE_INT_MAX_INL_PRECISION> and punt if we'd have
    larger _BitInt based iterators.  Programs having more than 128-bit
    iterators will hopefully be rare and I think it is fine to treat loops with
    more than 2^127 iterations as effectively possibly infinite.
    omp-general.cc is changed to use fixed_wide_int_storage <1024>, as it
    better should support scores with the same precision on all arches.
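
    The estimate-then-shrink protocol described above might look roughly like
    the sketch below.  It is an assumption-laden illustration, not the real
    widest_int_storage: shrinking_limbs, inl_elts and canonical_len are
    hypothetical names, and only the write_val (estimate) / set_len (len)
    shape is taken from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for WIDE_INT_MAX_INL_ELTS.
constexpr unsigned inl_elts = 3;

// Illustrative storage whose layout depends on the current length.
class shrinking_limbs
{
  union { int64_t val[inl_elts]; int64_t *valp; } u;
  unsigned len = 0;

public:
  shrinking_limbs () = default;
  shrinking_limbs (const shrinking_limbs &) = delete;
  shrinking_limbs &operator= (const shrinking_limbs &) = delete;
  ~shrinking_limbs () { if (on_heap ()) delete[] u.valp; }

  bool on_heap () const { return len > inl_elts; }
  unsigned get_len () const { return len; }
  const int64_t *get_val () const { return on_heap () ? u.valp : u.val; }

  // Reserve room for an upper-bound estimate of the limbs to be written.
  int64_t *write_val (unsigned estimate)
  {
    if (on_heap ())
      delete[] u.valp;
    len = estimate;
    if (on_heap ())
      {
        u.valp = new int64_t[len];
        return u.valp;
      }
    return u.val;
  }

  // Record the real length; if the estimate forced a heap allocation but
  // the result fits inline, copy the limbs back and free the heap buffer.
  void set_len (unsigned l)
  {
    assert (l <= len);
    if (on_heap () && l <= inl_elts)
      {
        int64_t *heap = u.valp;
        std::memcpy (u.val, heap, l * sizeof (int64_t));
        delete[] heap;
      }
    len = l;
  }
};

// Drop top limbs that are only the sign extension of the limb below,
// which is why the final length can be smaller than the estimate.
inline unsigned canonical_len (const int64_t *val, unsigned len)
{
  while (len > 1)
    {
      int64_t sign_ext = val[len - 2] < 0 ? -1 : 0;
      if (val[len - 1] != sign_ext)
        break;
      --len;
    }
  return len;
}
```

    A caller would reserve limbs with write_val using its estimate, fill
    them in, and finish with set_len (canonical_len (...)); a result that
    turned out small then ends up inline again.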

    Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing
    wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for
    larger lengths.

    On x86_64, the patch in --enable-checking=yes,rtl,extra configured
    bootstrapped cc1plus enlarges the .text section by 1.01% - from
    0x25725a5 to 0x25e5555 and similarly at least when compiling insn-recog.cc
    with the usual bootstrap option slows compilation down by 1.01%,
    user 4m22.046s and 4m22.384s on vanilla trunk vs.
    4m25.947s and 4m25.581s on patched trunk.  I'm afraid some code size growth
    and compile time slowdown is unavoidable in this case, we use wide_int and
    widest_int everywhere, and while the rare cases are marked with UNLIKELY
    macros, it still means extra checks for it.

    The patch also regresses
    +FAIL: gm2/pim/fail/largeconst.mod,  -O
    +FAIL: gm2/pim/fail/largeconst.mod,  -O -g
    +FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer
    +FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer -finline-functions
    +FAIL: gm2/pim/fail/largeconst.mod,  -Os
    +FAIL: gm2/pim/fail/largeconst.mod,  -g
    +FAIL: gm2/pim/fail/largeconst2.mod,  -O
    +FAIL: gm2/pim/fail/largeconst2.mod,  -O -g
    +FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer
    +FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer -finline-functions
    +FAIL: gm2/pim/fail/largeconst2.mod,  -Os
    +FAIL: gm2/pim/fail/largeconst2.mod,  -g
    tests, which previously were rejected with
    error: constant literal ‘12345678912345678912345679123456789123456789123456789123456789123456791234567891234567891234567891234567891234567912345678912345678912345678912345678912345679123456789123456789’ exceeds internal ZTYPE range
    kind of errors, but now are accepted.  It seems the FE tries to parse
    constants into widest_int in that case and only diagnoses if widest_int
    overflows.  That seems wrong; it should at least punt if stuff doesn't fit
    into WIDE_INT_MAX_PRECISION, but perhaps at far less than that.  If it wants
    middle-end support for precisions above 128 bits, it better should be using
    BITINT_TYPE.  Will file a PR and defer to the Modula2 maintainer.

    2023-10-12  Jakub Jelinek  <jakub@redhat.com>

            PR c/102989
            * wide-int.h: Adjust file comment.
            (WIDE_INT_MAX_INL_ELTS): Define to former value of WIDE_INT_MAX_ELTS.
            (WIDE_INT_MAX_INL_PRECISION): Define.
            (WIDE_INT_MAX_ELTS): Change to 255.  Assert that WIDE_INT_MAX_INL_ELTS
            is smaller than WIDE_INT_MAX_ELTS.
            (RWIDE_INT_MAX_ELTS, RWIDE_INT_MAX_PRECISION, WIDEST_INT_MAX_ELTS,
            WIDEST_INT_MAX_PRECISION): Define.
            (WI_BINARY_RESULT_VAR, WI_UNARY_RESULT_VAR): Change write_val callers
            to pass 0 as a new argument.
            (class widest_int_storage): Likewise.
            (widest_int, widest2_int): Change typedefs to use widest_int_storage
            rather than fixed_wide_int_storage.
            (enum wi::precision_type): Add INL_CONST_PRECISION enumerator.
            (struct binary_traits): Add partial specializations for
            INL_CONST_PRECISION.
            (generic_wide_int): Add needs_write_val_arg static data member.
            (int_traits): Likewise.
            (wide_int_storage): Replace val non-static data member with a union
            u of it and HOST_WIDE_INT *valp.  Declare copy constructor, copy
            assignment operator and destructor.  Add unsigned int argument to
            write_val.
            (wide_int_storage::wide_int_storage): Initialize precision to 0
            in the default ctor.  Remove unnecessary {}s around STATIC_ASSERTs.
            Assert in non-default ctor T's precision_type is not
            INL_CONST_PRECISION and allocate u.valp for large precision.  Add
            copy constructor.
            (wide_int_storage::~wide_int_storage): New.
            (wide_int_storage::operator=): Add copy assignment operator.  In
            assignment operator remove unnecessary {}s around STATIC_ASSERTs,
            assert ctor T's precision_type is not INL_CONST_PRECISION and
            if precision changes, deallocate and/or allocate u.valp.
            (wide_int_storage::get_val): Return u.valp rather than u.val for
            large precision.
            (wide_int_storage::write_val): Likewise.  Add an unused unsigned int
            argument.
            (wide_int_storage::set_len): Use write_val instead of writing val
            directly.
            (wide_int_storage::from, wide_int_storage::from_array): Adjust
            write_val callers.
            (wide_int_storage::create): Allocate u.valp for large precisions.
            (wi::int_traits <wide_int_storage>::get_binary_precision): New.
            (fixed_wide_int_storage::fixed_wide_int_storage): Make default
            ctor defaulted.
            (fixed_wide_int_storage::write_val): Add unused unsigned int argument.
            (fixed_wide_int_storage::from, fixed_wide_int_storage::from_array):
            Adjust write_val callers.
            (wi::int_traits <fixed_wide_int_storage>::get_binary_precision): New.
            (WIDEST_INT): Define.
            (widest_int_storage): New template class.
            (wi::int_traits <widest_int_storage>): New.
            (trailing_wide_int_storage::write_val): Add unused unsigned int
            argument.
            (wi::get_binary_precision): Use
            wi::int_traits <WI_BINARY_RESULT (T1, T2)>::get_binary_precision
            rather than get_precision on get_binary_result.
            (wi::copy): Adjust write_val callers.  Don't call set_len if
            needs_write_val_arg.
            (wi::bit_not): If result.needs_write_val_arg, call write_val
            again with upper bound estimate of len.
            (wi::sext, wi::zext, wi::set_bit): Likewise.
            (wi::bit_and, wi::bit_and_not, wi::bit_or, wi::bit_or_not,
            wi::bit_xor, wi::add, wi::sub, wi::mul, wi::mul_high, wi::div_trunc,
            wi::div_floor, wi::div_ceil, wi::div_round, wi::divmod_trunc,
            wi::mod_trunc, wi::mod_floor, wi::mod_ceil, wi::mod_round,
            wi::lshift, wi::lrshift, wi::arshift): Likewise.
            (wi::bswap, wi::bitreverse): Assert result.needs_write_val_arg
            is false.
            (gt_ggc_mx, gt_pch_nx): Remove generic template for all
            generic_wide_int, instead add functions and templates for each
            storage of generic_wide_int.  Make functions for
            generic_wide_int <wide_int_storage> and templates for
            generic_wide_int <widest_int_storage <N>> deleted.
            (wi::mask, wi::shifted_mask): Adjust write_val calls.
            * wide-int.cc (zeros): Decrease array size to 1.
            (BLOCKS_NEEDED): Use CEIL.
            (canonize): Use HOST_WIDE_INT_M1.
            (wi::from_buffer): Pass 0 to write_val.
            (wi::to_mpz): Use CEIL.
            (wi::from_mpz): Likewise.  Pass 0 to write_val.  Use
            WIDE_INT_MAX_INL_ELTS instead of WIDE_INT_MAX_ELTS.
            (wi::mul_internal): Use WIDE_INT_MAX_INL_PRECISION instead of
            MAX_BITSIZE_MODE_ANY_INT in automatic array sizes, for prec
            above WIDE_INT_MAX_INL_PRECISION estimate precision from
            lengths of operands.  Use XALLOCAVEC allocated buffers for
            prec above WIDE_INT_MAX_INL_PRECISION.
            (wi::divmod_internal): Likewise.
            (wi::lshift_large): For len > WIDE_INT_MAX_INL_ELTS estimate
            it from xlen and skip.
            (rshift_large_common): Remove xprecision argument, add len
            argument with len computed in caller.  Don't return anything.
            (wi::lrshift_large, wi::arshift_large): Compute len here
            and pass it to rshift_large_common, for lengths above
            WIDE_INT_MAX_INL_ELTS using estimations from xlen if possible.
            (assert_deceq, assert_hexeq): For lengths above
            WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
            (test_printing): Use WIDE_INT_MAX_INL_PRECISION instead of
            WIDE_INT_MAX_PRECISION.
            * wide-int-print.h (WIDE_INT_PRINT_BUFFER_SIZE): Use
            WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION.
            * wide-int-print.cc (print_decs, print_decu, print_hex): For
            lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
            * tree.h (wi::int_traits<extended_tree <N>>): Change precision_type
            to INL_CONST_PRECISION for N == ADDR_MAX_PRECISION.
            (widest_extended_tree): Use WIDEST_INT_MAX_PRECISION instead of
            WIDE_INT_MAX_PRECISION.
            (wi::ints_for): Use int_traits <extended_tree <N> >::precision_type
            instead of hard coded CONST_PRECISION.
            (widest2_int_cst): Use WIDEST_INT_MAX_PRECISION instead of
            WIDE_INT_MAX_PRECISION.
            (wi::extended_tree <N>::get_len): Use WIDEST_INT_MAX_PRECISION rather
            than WIDE_INT_MAX_PRECISION.
            (wi::ints_for::zero): Use
            wi::int_traits <wi::extended_tree <N> >::precision_type instead of
            wi::CONST_PRECISION.
            * tree.cc (build_replicated_int_cst): Formatting fix.  Use
            WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS.
            * print-tree.cc (print_node): Don't print TREE_UNAVAILABLE on
            INTEGER_CSTs, TREE_VECs or SSA_NAMEs.
            * double-int.h (wi::int_traits <double_int>::precision_type): Change
            to INL_CONST_PRECISION from CONST_PRECISION.
            * poly-int.h (struct poly_coeff_traits): Add partial specialization
            for wi::INL_CONST_PRECISION.
            * cfgloop.h (bound_wide_int): New typedef.
            (struct nb_iter_bound): Change bound type from widest_int to
            bound_wide_int.
            (struct loop): Change nb_iterations_upper_bound,
            nb_iterations_likely_upper_bound and nb_iterations_estimate type from
            widest_int to bound_wide_int.
            * cfgloop.cc (record_niter_bound): Return early if wi::min_precision
            of i_bound is too large for bound_wide_int.  Adjustments for the
            widest_int to bound_wide_int type change in non-static data members.
            (get_estimated_loop_iterations, get_max_loop_iterations,
            get_likely_max_loop_iterations): Adjustments for the widest_int to
            bound_wide_int type change in non-static data members.
            * tree-vect-loop.cc (vect_transform_loop): Likewise.
            * tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
            XALLOCAVEC allocated buffer for i_bound len above
            WIDE_INT_MAX_INL_ELTS.
            (record_estimate): Return early if wi::min_precision of i_bound is too
            large for bound_wide_int.  Adjustments for the widest_int to
            bound_wide_int type change in non-static data members.
            (wide_int_cmp): Use bound_wide_int instead of widest_int.
            (bound_index): Use bound_wide_int instead of widest_int.
            (discover_iteration_bound_by_body_walk): Likewise.  Use
            widest_int::from to convert it to widest_int when passed to
            record_niter_bound.
            (maybe_lower_iteration_bound): Use widest_int::from to convert it to
            widest_int when passed to record_niter_bound.
            (estimate_numbers_of_iteration): Don't record upper bound if
            loop->nb_iterations has too large precision for bound_wide_int.
            (n_of_executions_at_most): Use widest_int::from.
            * tree-ssa-loop-ivcanon.cc (remove_redundant_iv_tests): Adjust for
            the widest_int to bound_wide_int changes.
            * match.pd (fold_sign_changed_comparison simplification): Use
            wide_int::from on wi::to_wide instead of wi::to_widest.
            * value-range.h (irange::maybe_resize): Avoid using memcpy on
            non-trivially copyable elements.
            * value-range.cc (irange_bitmask::dump): Use XALLOCAVEC allocated
            buffer for mask or value len above WIDE_INT_PRINT_BUFFER_SIZE.
            * fold-const.cc (fold_convert_const_int_from_int, fold_unary_loc):
            Use wide_int::from on wi::to_wide instead of wi::to_widest.
            * tree-ssa-ccp.cc (bit_value_binop): Zero extend r1max from width
            before calling wi::udiv_trunc.
            * lto-streamer-out.cc (output_cfg): Adjustments for the widest_int to
            bound_wide_int type change in non-static data members.
            * lto-streamer-in.cc (input_cfg): Likewise.
            (lto_input_tree_1): Use WIDE_INT_MAX_INL_ELTS rather than
            WIDE_INT_MAX_ELTS.  For length above WIDE_INT_MAX_INL_ELTS use
            XALLOCAVEC allocated buffer.  Formatting fix.
            * data-streamer-in.cc (streamer_read_wide_int,
            streamer_read_widest_int): Likewise.
            * tree-affine.cc (aff_combination_expand): Use placement new to
            construct name_expansion.
            (free_name_expansion): Destruct name_expansion.
            * gimple-ssa-strength-reduction.cc (struct slsr_cand_d): Change
            index type from widest_int to offset_int.
            (class incr_info_d): Change incr type from widest_int to offset_int.
            (alloc_cand_and_find_basis, backtrace_base_for_ref,
            restructure_reference, slsr_process_ref, create_mul_ssa_cand,
            create_mul_imm_cand, create_add_ssa_cand, create_add_imm_cand,
            slsr_process_add, cand_abs_increment, replace_mult_candidate,
            replace_unconditional_candidate, incr_vec_index,
            create_add_on_incoming_edge, create_phi_basis_1,
            replace_conditional_candidate, record_increment,
            record_phi_increments_1, phi_incr_cost_1, phi_incr_cost,
            lowest_cost_path, total_savings, ncd_with_phi, ncd_of_cand_and_phis,
            nearest_common_dominator_for_cands, insert_initializers,
            all_phi_incrs_profitable_1, replace_one_candidate,
            replace_profitable_candidates): Use offset_int rather than widest_int
            and wi::to_offset rather than wi::to_widest.
            * real.cc (real_to_integer): Use WIDE_INT_MAX_INL_ELTS rather than
            2 * WIDE_INT_MAX_ELTS and for words above that use XALLOCAVEC
            allocated buffer.
            * tree-ssa-loop-ivopts.cc (niter_for_exit): Use placement new
            to construct tree_niter_desc and destruct it on failure.
            (free_tree_niter_desc): Destruct tree_niter_desc if value is non-NULL.
            * gengtype.cc (main): Remove widest_int handling.
            * graphite-isl-ast-to-gimple.cc (widest_int_from_isl_expr_int): Use
            WIDEST_INT_MAX_ELTS instead of WIDE_INT_MAX_ELTS.
            * gimple-ssa-warn-alloca.cc (pass_walloca::execute): Use
            WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION and
            assert get_len () fits into it.
            * value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
            For mask or value lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC
            allocated buffer.
            * gimple-ssa-sprintf.cc (adjust_range_for_overflow): Use
            wide_int::from on wi::to_wide instead of wi::to_widest.
            * omp-general.cc (score_wide_int): New typedef.
            (omp_context_compute_score): Use score_wide_int instead of widest_int
            and adjust for those changes.
            (struct omp_declare_variant_entry): Change score and
            score_in_declare_simd_clone non-static data member type from widest_int
            to score_wide_int.
            (omp_resolve_late_declare_variant, omp_resolve_declare_variant): Use
            score_wide_int instead of widest_int and adjust for those changes.
            (omp_lto_output_declare_variant_alt): Likewise.
            (omp_lto_input_declare_variant_alt): Likewise.
            * godump.cc (go_output_typedef): Assert get_len () is smaller than
            WIDE_INT_MAX_INL_ELTS.
    gcc/c-family/
            * c-warn.cc (match_case_to_enum_1): Use wi::to_wide just once instead
            of 3 times, assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS.
    gcc/testsuite/
            * gcc.dg/bitint-38.c: New test.