From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qt1-x834.google.com (mail-qt1-x834.google.com [IPv6:2607:f8b0:4864:20::834]) by sourceware.org (Postfix) with ESMTPS id 29CA7385AC38 for ; Tue, 5 Oct 2021 16:08:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 29CA7385AC38 Received: by mail-qt1-x834.google.com with SMTP id r1so1723291qta.12 for ; Tue, 05 Oct 2021 09:08:45 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language; bh=xWM1Q7lLdnZ3E0u7V8iWkE9PbQ+D6YgwG2Mz/5Yo2Fk=; b=ttAdk+uNS6LohnWU7hJUWyBb78GKeaIyEUIWsUj45lFlkEjr2uBZ48+g5a+yVpuK8b 7XCZew3k/GjquyQaH9pdANmGtlH1nFJSrzMB2uEg05sOTqsWCC2/0rDYU18M+lCpZyBh yQe35QR1e6SMWnpkjCCTM9Qp1mZKiEeCNZgSRAHpf8vIZr+Eghj930CIflaiQmtVdHSk mwx1kL+GI+NL6wuwSVbh17XB833Guh16KBStU9DyIFUXb1pqubhE434YDZFw6sUYgdcz JSdcJ5/NhFdOBJVffEp7B1ZIvjQcgyL5z/GT8+wTnfbY/jFpb4uG1uolTvnrJ3hOJeHv YXJg== X-Gm-Message-State: AOAM533LduudFA2t1ypqfitDhYIuz9hNHRFk9hT+Ut5zggFXqZZr/2qY Hcs2oPVl0ovnVWoBqf7IpkAAUKIzldE= X-Google-Smtp-Source: ABdhPJx8gy60gkBS+IkpTTZvL2NPw5yQrFRVYERKlKF0nvI7AtSyGP7YxBDFjjX/Ntsi64IoGV5F9w== X-Received: by 2002:ac8:4585:: with SMTP id l5mr20074743qtn.93.1633450124342; Tue, 05 Oct 2021 09:08:44 -0700 (PDT) Received: from [172.31.0.175] (c-98-202-48-222.hsd1.ut.comcast.net. [98.202.48.222]) by smtp.gmail.com with ESMTPSA id o9sm9773813qko.91.2021.10.05.09.08.43 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 05 Oct 2021 09:08:43 -0700 (PDT) Subject: Re: [RFC] More jump threading restrictions in the presence of loops. To: Aldy Hernandez , Michael Matz Cc: Richard Biener , Andrew MacLeod , GCC patches References: <20211004094313.1596556-1-aldyh@redhat.com> From: Jeff Law Message-ID: Date: Tue, 5 Oct 2021 10:08:42 -0600 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, HTML_MESSAGE, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Oct 2021 16:08:47 -0000 On 10/5/2021 7:33 AM, Aldy Hernandez wrote: > > OK? > > BTW Jeff, this fixes all the regressions you mention except: > > 1. pr68911.c > > The path not being threaded here is 7->10->11->12. It crosses loops > multiple times, so I believe the restriction code is correct. > > 7, 10, 12 are in loop1. > 11 is in loop 2. > > So we have a path going from loop1 -> loop2 -> loop1. I can't > conceive any scenario where this is ok, but I can attach the entire IL > if I'm missing something. Wow.  I'm having trouble seeing how we were threading that, but clearly not OK. > > 2. 961125-1.c > > This one is a bit trickier. Here we're trying to thread the following > conditional, which I mentioned when I contributed this work, we don't > handle (and happens infrequently enough not to matter): Sorry, I'd forgotten about this case. > > + // Loop 4 > + [local count: 114863531]: > + # ptr_8 = PHI > + if (ptr_8 < &MEM [(void *)":ab" + 3B]) > + goto ; [50.00%] > + else > + goto ; [50.00%] > > The hybrid threader doesn't handle &MEM in the final conditional. As > I mentioned earlier, if this becomes an issue, we can adapt class > pointer_equiv_analyzer like we did for evrp. I have a gimple FE test > I will contribute as an XFAIL with an associated PR to keep us honest. > > That being said... in this particular test, this is all irrelevant > because the path will be disallowed for two reasons: > > a) The path crosses loops, and the reason we didn't realize it in the > old code was because the ASSERT_EXPR had pulled the SSA outside the > loop, so it looks like the entire path is l in the same loop. If you > look at the original IL, it's not. > > b) Now the path actually fits the pattern being discussed in this > patch, where there's an early exit out of a loop, so it looks like we > should handle it. But...in this case, we would fill a presently empty > latch. Interestingly, the old code didn't catch it, because....you > guessed it...there was an ASSERT_EXPR in the latch. > > So I argue that even in the absence of MEM_REF handling, we shouldn't > thread this path. I can live with that too.  I'll set new baselines for visium-elf so that these don't show up again. > > 0001-Loosen-loop-crossing-restriction-in-threader.patch > > From 5abe65668f602d53b8f3dc515508dc4616beb048 Mon Sep 17 00:00:00 2001 > From: Aldy Hernandez > Date: Tue, 5 Oct 2021 15:03:34 +0200 > Subject: [PATCH] Loosen loop crossing restriction in threader. > > Crossing loops is generally discouraged from the threader, but we can > make an exception when we don't cross the latch or enter another loop, > since this is just an early exit out of the loop. > > In fact, the whole threaded path is logically outside the loop. This > has nice secondary effects. For example, objects on the threaded path > will no longer necessarily be live throughout the loop, so we can get > register allocation improvements. The threaded path can physically > move outside the loop resulting in better icache efficiency, etc. > > Tested on x86-64 Linux, and on a visium-elf cross making sure that the > following tests do not have an abort in the final assembly: > > gcc.c-torture/execute/960218-1.c > gcc.c-torture/execute/visium-pending-4.c > gcc.c-torture/execute/pr58209.c > > gcc/ChangeLog: > > * tree-ssa-threadupdate.c (jt_path_registry::cancel_invalid_paths): > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/ssa-thread-valid.c: New test. OK.  I think there may still be some concern if the latch is marked as a joiner.    I think we should always reject those before the loop optimizers run as the joiner's clone would introduce a second latch.   I think that can be handled in a follow-up refinement. > > Co-authored-by: Jeff Law I'm not sure I deserve co-authorship :-)  You're doing all the work, I'm just reporting the regressions. jeff