From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id A8A253858D20 for ; Mon, 15 May 2023 17:23:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A8A253858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1684171433; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wxySQmEY+Zf7q3ESPSYFq29aqPl88Ngj0orW7FsMsrs=; b=P4Vy5vS4pmUty/BgrFED0/f5ZvReojXqTxqYhCJKyHE+Xe8MLpTSicgpfg78wNNR3n+NFj 6bwXutK1lu/lGdM0gwSNRPq7zOig9j2mTtDz6bo0W2zE955MTSqhu/JHoYW9gT3GVMZN2C jhXVwZMnwDE848tBA09uAmhzStqedr8= Received: from mail-ej1-f71.google.com (mail-ej1-f71.google.com [209.85.218.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-568-KTVwYxo8PYGQv2nRYZblWw-1; Mon, 15 May 2023 13:23:51 -0400 X-MC-Unique: KTVwYxo8PYGQv2nRYZblWw-1 Received: by mail-ej1-f71.google.com with SMTP id a640c23a62f3a-965b73d8b7eso1296610666b.2 for ; Mon, 15 May 2023 10:23:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684171430; x=1686763430; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wxySQmEY+Zf7q3ESPSYFq29aqPl88Ngj0orW7FsMsrs=; b=ZANShU7njUOIRTB6qafMmjciqNB881kDH+dr443JY6zTYGjPBh0o2oFyE9y3DgxcPS tAOSSKF72clXqTJgnOjGjE537gzBQIX0Hd3aPt4TJvh7tApxQ8j1e2MEg1x3FzD7BnR2 vfDmT0jJP+60H5zVE7qV81qUctQSJdTrb18prDbLN+DTsmB5ZcNkbBTawZ7nSWjOHV0e 5mrUZzmpHF+Vx8/gEP8VtSXYAKA7yz2Q0nHkmWTDwVr6eh2UMiMkWSXrO7WMd5n1o/jO 0RtG70H4a8nMgKcyweTbs0QqwTb9ZaAMbTZasktLEylbvkkH+KieRyPu8AJ++p7lH3vh NOLw== X-Gm-Message-State: AC+VfDyAArSdNCvXkwzgqDaxKDZvDBxJFYbIbQYVPyxOHRy+JY/Jo9UE tLPckaQeKydwNchH6rdXHrTwFANAqmfoSq1eTqreCIZZeWG3xF6U8rbUUoIi8Edt7ZEkzzEvo27 ZL2e+SfCejJwABr0fIkDfZEWD+Gw1zjItQw== X-Received: by 2002:a17:906:ee86:b0:962:46d7:c8fc with SMTP id wt6-20020a170906ee8600b0096246d7c8fcmr34312185ejb.21.1684171430643; Mon, 15 May 2023 10:23:50 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5SZDh3z0ZbxMYIx2a+rLbwjjHHG09zz6XtLF4Q3gMJXyv6pplMtUgTr79FdBwm8V1dIHHVWvT/RD9GNHuIpfs= X-Received: by 2002:a17:906:ee86:b0:962:46d7:c8fc with SMTP id wt6-20020a170906ee8600b0096246d7c8fcmr34312168ejb.21.1684171430340; Mon, 15 May 2023 10:23:50 -0700 (PDT) MIME-Version: 1.0 References: <20230515103523.100412-1-aldyh@redhat.com> <27b0ef44-02d3-fed3-f5b6-6651d4293826@redhat.com> In-Reply-To: <27b0ef44-02d3-fed3-f5b6-6651d4293826@redhat.com> From: Aldy Hernandez Date: Mon, 15 May 2023 19:23:39 +0200 Message-ID: Subject: Re: [PATCH] Add auto-resizing capability to irange's [PR109695] To: Richard Biener Cc: GCC patches , Andrew MacLeod , Jakub Jelinek , mjambor@suse.cz X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-5.2 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Mon, May 15, 2023 at 5:03=E2=80=AFPM Aldy Hernandez w= rote: > > > > On 5/15/23 13:08, Richard Biener wrote: > > On Mon, May 15, 2023 at 12:35=E2=80=AFPM Aldy Hernandez wrote: > >> > >> > >> We can now have int_range for automatically > >> resizable ranges. int_range_max is now int_range<3, true> > >> for a 69X reduction in size from current trunk, and 6.9X reduction fro= m > >> GCC12. This incurs a 5% performance penalty for VRP that is more than > >> covered by our > 13% improvements recently. > >> > >> > >> int_range_max is the temporary range object we use in the ranger for > >> integers. With the conversion to wide_int, this structure bloated up > >> significantly because wide_ints are huge (80 bytes a piece) and are > >> about 10 times as big as a plain tree. Since the temporary object > >> requires 255 sub-ranges, that's 255 * 80 * 2, plus the control word. > >> This means the structure grew from 4112 bytes to 40912 bytes. > >> > >> This patch adds the ability to resize ranges as needed, defaulting to > >> no resizing, while int_range_max now defaults to 3 sub-ranges (instead > >> of 255) and grows to 255 when the range being calculated does not fit. > >> > >> For example: > >> > >> int_range<1> foo; // 1 sub-range with no resizing. > >> int_range<5> foo; // 5 sub-ranges with no resizing. > >> int_range<5, true> foo; // 5 sub-ranges with resizing. > >> > >> I ran some tests and found that 3 sub-ranges cover 99% of cases, so > >> I've set the int_range_max default to that: > >> > >> typedef int_range<3, /*RESIZABLE=3D*/true> int_range_max; > >> > >> We don't bother growing incrementally, since the default covers most > >> cases and we have a 255 hard-limit. This hard limit could be reduced > >> to 128, since my tests never saw a range needing more than 124, but we > >> could do that as a follow-up if needed. > >> > >> With 3-subranges, int_range_max is now 592 bytes versus 40912 for > >> trunk, and versus 4112 bytes for GCC12! The penalty is 5.04% for VRP > >> and 3.02% for threading, with no noticeable change in overall > >> compilation (0.27%). This is more than covered by our 13.26% > >> improvements for the legacy removal + wide_int conversion. > > > > Thanks for doing this. > > > >> I think this approach is a good alternative, while providing us with > >> flexibility going forward. For example, we could try defaulting to a > >> 8 sub-ranges for a noticeable improvement in VRP. We could also use > >> large sub-ranges for switch analysis to avoid resizing. > >> > >> Another approach I tried was always resizing. With this, we could > >> drop the whole int_range nonsense, and have irange just hold a > >> resizable range. This simplified things, but incurred a 7% penalty on > >> ipa_cp. This was hard to pinpoint, and I'm not entirely convinced > >> this wasn't some artifact of valgrind. However, until we're sure, > >> let's avoid massive changes, especially since IPA changes are coming > >> up. > >> > >> For the curious, a particular hot spot for IPA in this area was: > >> > >> ipcp_vr_lattice::meet_with_1 (const value_range *other_vr) > >> { > >> ... > >> ... > >> value_range save (m_vr); > >> m_vr.union_ (*other_vr); > >> return m_vr !=3D save; > >> } > >> > >> The problem isn't the resizing (since we do that at most once) but the > >> fact that for some functions with lots of callers we end up a huge > >> range that gets copied and compared for every meet operation. Maybe > >> the IPA algorithm could be adjusted somehow??. > > > > Well, the above just wants to know whether the union_ operation changed > > the range. I suppose that would be an interesting (and easy to compute= ?) > > secondary output of union_ and it seems it already computes that (but > > maybe not correctly?). So I suggest to change the above to > > union_ returns a value specifically for that, which Andrew uses for > cache optimization. For that matter, your suggestion was my first > approach, but I quickly found out we were being overly pessimistic in > some cases, and I was too lazy to figure out why. > > > > > bool res; > > if (flag_checking) > > { > > value_range save (m_vr); > > res =3D m_vr.union_ (*other_vr); > > gcc_assert (res =3D=3D (m_vr !=3D save)); > > } > > else > > res =3D m_vr.union (*other_vr); > > return res; > > With your suggested sanity check I chased the problem to a minor > inconsistency when unioning nonzero masks. The issue wasn't a bug, just > a pessimization. I'm attaching a patch that corrects the oversight > (well, not oversight, everything was more expensive with trees)... It > yields a 6.89% improvement to the ipa-cp pass!!! Thanks. > > I'll push it if it passes tests. Tests passed. Pushed patch. I've also pushed the original patch in this email. We can address anything else as a follow-up. Thanks for everyone's feedback. Aldy