From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 70EEB3858C00 for ; Thu, 4 Aug 2022 07:08:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 70EEB3858C00 Received: from mail-oa1-f72.google.com (mail-oa1-f72.google.com [209.85.160.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-401-6xTKZPaDMk6Ra9rqyyYQFw-1; Thu, 04 Aug 2022 03:08:12 -0400 X-MC-Unique: 6xTKZPaDMk6Ra9rqyyYQFw-1 Received: by mail-oa1-f72.google.com with SMTP id 586e51a60fabf-10e7667b972so6462386fac.12 for ; Thu, 04 Aug 2022 00:08:11 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc; bh=iVqLBTveVQ1NamqZuqvIuHPCgFNXIYtzPoGE0hqBRs8=; b=MQRMLQadCcd8GjgX2z+YzZxzlWJ3Uhgqnd2WhTG29SyIX0D550coXmElXYG5/r3o0k k2MidAruofpKhzFnmk3OerVf426sihaQBLs9Xz1FxTuLi8R1s14SN/C+WrgBnnqqCoZd iTWCiYbBDjwoUBNur7CIUHnIZCd3Okh9NF+GXHk0mto08BUMiZZtjwJzEI+fqao0Hmji kksRVZPMJHlAI5WC/hhYfpjRqixBtrATtKt1Ay/cU+q7Wr7buQW6IikAQ7Xh71UGXV10 C5yTMSDRDhHguzQP98ei5RLF+RwuUJvEKgsQVTt3TU5P+vw2etvNeAkUpLPz3dAmZJgO UPJA== X-Gm-Message-State: ACgBeo148dXJvA9M/T0Wb8jmlCFAtcGga7kbq+kPI7G9FdSjYcbnSthE NGAkgZufNVVucHHNNiP3b3QyTNDR+Xjan31HKeXrBdBA+VCWbibzNcZ58gUrGE64kTNPw7u6fLk Xuw9lYQ3K+enrhsUB43jTgr/jKDyhbrGYIg== X-Received: by 2002:a4a:894e:0:b0:440:b743:c7d with SMTP id g14-20020a4a894e000000b00440b7430c7dmr221362ooi.86.1659596891185; Thu, 04 Aug 2022 00:08:11 -0700 (PDT) X-Google-Smtp-Source: AA6agR7ZxlrGulBN2Akcp7ZUEv16bVm8pcn/ZGXyW5UCcH5yoFd6p0TLUyEEQY9+yQONo1QngfWXlfgvCkNI3qSIzsU= X-Received: by 2002:a4a:894e:0:b0:440:b743:c7d with SMTP id g14-20020a4a894e000000b00440b7430c7dmr221352ooi.86.1659596890901; Thu, 04 Aug 2022 00:08:10 -0700 (PDT) MIME-Version: 1.0 References: <45651.122080305533000620@us-mta-640.us.mimecast.lan> In-Reply-To: <45651.122080305533000620@us-mta-640.us.mimecast.lan> From: Aldy Hernandez Date: Thu, 4 Aug 2022 09:08:00 +0200 Message-ID: Subject: Re: [PATCH] Backwards threader greedy search TLC To: Richard Biener Cc: gcc-patches , "MacLeod, Andrew" X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-6.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Aug 2022 07:08:14 -0000 On Wed, Aug 3, 2022 at 11:53 AM Richard Biener wrote: > > I've tried to understand how the greedy search works seeing the > bitmap dances and the split into resolve_phi. I've summarized > the intent of the algorithm as > > // For further greedy searching we want to remove interesting > // names defined in BB but add ones on the PHI edges for the > // respective edges. > > but the implementation differs in detail. In particular when > there is more than one interesting PHI in BB it seems to only consider > the first for translating defs across edges. It also only applies > the loop crossing restriction when there is an interesting PHI. > I've also noticed that while the set of interesting names is rolled > back, m_imports just keeps growing - is that a bug? I've never quite liked how I was handling PHIs. I had some improvements in this space, especially the problem with only considering the first def across a PHI, but I ran out of time last cycle, and we were already into stage3. The loop crossing restriction I inherited from the original implementation, though I suppose I restricted it further by only looking at interesting PHIs. In practice I don't think it mattered, because we cap loop crossing violations in cancel_invalid_paths in the registry. Does anything break if you keep the restriction across the board? Good spotting on the imports growing. Instinctively that seems like a bug. [As a suggestion, check the total threadable paths as a sanity check if you make any changes in this space. What I've been doing is counting "Jumps threaded" in the *.stat dump file across the .ii files in a bootstrap, and making sure you're not getting the same # of threads because we missed one in the backward threader which then got picked up by DOM. I divide up the threads by ethread, thread, threadfull, DOM, etc, to make sure I'm not just shuffling threading opportunities around.] But yeah, we shouldn't be growing imports unnecessarily, if nothing else because of the bitmap explosion you're noticing in other places. In practice, I'm not so sure it slows the solver itself down: // They are hints for the solver. Adding more elements doesn't slow // us down, because we don't solve anything that doesn't appear in the // path. On the other hand, not having enough imports will limit what // we can solve. But that comment may be old ;-). > > The following preserves the loop crossing restriction to the case > of interesting PHIs but merges resolve_phi back, changing interesting > as outlined with the intent above. It should get more threading > cases when there are multiple interesting PHI defs in a block. > It might be a bit faster due to less bitmap operations but in the > end the main intent was to make what happens more obvious. Sweet! > > Bootstrap and regtest pending on x86_64-unknown-linux-gnu. > > Aldy - you wrote the existing implementation, is the following OK? I'm tickled pink you're cleaning and improving this. Looks good. Thanks. Aldy