From: "aldyh at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109695] [14 Regression] crash in gimple_ranger::range_of_expr since r14-377-gc92b8be9b52b7e
Date: Tue, 09 May 2023 12:00:20 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109695

--- Comment #23 from Aldy Hernandez ---
An update on the int_range_max memory bloat work.

As Andrew mentioned, having int_range<25> solves the problem, but it's just
kicking the can down the road.  I ran some stats on what we actually need on a
bootstrap, and 99.7% of ranges fit in a 3 sub-range range, but we need more to
represent switches, etc.

There's no clear winner for choosing <N>, as the distribution for anything
past <3> is rather random.  What I did see was that at no point do we need
more than 125 sub-ranges (on a set of .ii files from a bootstrap).

I've implemented various alternatives using a dynamic approach similar to what
we do for auto_vec.  I played with allocating 2x as much as needed, allocating
10 or 20 more than needed, as well as going from N to 255 in one go.  All of
it required some shuffling to make sure the penalty isn't much wrt virtuals,
etc., but I think the dynamic approach is the way to go.

The question is how much of a performance hit we're willing to take in order
to reduce the memory footprint.  Memory to speed is a linear relationship
here, so we just have to pick a number we're happy with.

Here are some numbers for various sub-range counts (the sub-ranges grow
automatically in union, intersect, invert, and assignment, which are the
methods that can grow the number of sub-ranges):

trunk (wide_ints <255>)  => 40912 bytes
GCC 12 (trees <255>)     => 4112 bytes
auto_int_range<2>        => 432 bytes (5.14% penalty for VRP)
auto_int_range<3>        => 592 bytes (4.01% penalty)
auto_int_range<8>        => 1392 bytes (2.68% penalty)
auto_int_range<10>       => 1712 bytes (2.14% penalty)

As you can see, even at N=10 we're still 24X smaller than trunk and 2.4X
smaller than GCC 12, for a 2.14% performance drop.

I'm tempted to just pick a number and tweak this later, as we have ultimate
flexibility here.  Plus, we can also revert to a very small N and have passes
that care about switches use their own temporaries (auto_int_range<20> or
such).
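For the curious, here's a rough standalone sketch of the growth scheme I'm
describing.  It's a minimal illustration, not the actual patch: the class and
member names are made up, the growth policy shown is the simple 2x one, and
the bounds are plain ints instead of wide_ints.

// Minimal sketch of an auto_vec-style growing range.  All names here
// are hypothetical; bounds are ints for brevity.
#include <algorithm>

struct sub_range { int lo, hi; };

template <unsigned N>
class auto_growing_range
{
public:
  auto_growing_range () : m_store (m_inline), m_capacity (N), m_num (0) {}
  ~auto_growing_range ()
  {
    if (m_store != m_inline)
      delete[] m_store;
  }
  // Non-copyable to keep the sketch simple.
  auto_growing_range (const auto_growing_range &) = delete;
  auto_growing_range &operator= (const auto_growing_range &) = delete;

  // Append a sub-range, spilling from the inline buffer to the heap the
  // first time we outgrow N.  In the real thing, union, intersect,
  // invert, and assignment would funnel through something like this.
  void append (int lo, int hi)
  {
    if (m_num == m_capacity)
      grow (m_capacity * 2);  // 2x policy; +10/+20 or a jump to 255 also work
    m_store[m_num].lo = lo;
    m_store[m_num].hi = hi;
    ++m_num;
  }

  unsigned num_pairs () const { return m_num; }
  const sub_range &operator[] (unsigned i) const { return m_store[i]; }

private:
  void grow (unsigned new_cap)
  {
    sub_range *p = new sub_range[new_cap];
    std::copy (m_store, m_store + m_num, p);
    if (m_store != m_inline)
      delete[] m_store;
    m_store = p;
    m_capacity = new_cap;
  }

  sub_range m_inline[N];  // fast path: no allocation for <= N pairs
  sub_range *m_store;     // points at m_inline until we outgrow it
  unsigned m_capacity;
  unsigned m_num;
};

The point of the layout is that the common case (<= N sub-ranges, i.e. 99.7%
of ranges on a bootstrap) never touches the heap; only the rare switch-heavy
ranges pay for an allocation.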
Note that we got a 13.22% improvement for the wide_int+legacy work, so even
the absolute worst case of a 5.14% penalty would have us sitting on a net
8.76% improvement over GCC 12 (the two compound: 0.8678 * 1.0514 is roughly
0.9124, i.e. still about 8.76% faster than GCC 12).  Bike shedding welcome ;-)