From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org (Postfix) with ESMTP id DB6063858C39 for ; Fri, 10 Sep 2021 16:05:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org DB6063858C39 Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-367-e47_jqh_OlyhrL5gXvymXw-1; Fri, 10 Sep 2021 12:05:22 -0400 X-MC-Unique: e47_jqh_OlyhrL5gXvymXw-1 Received: by mail-wr1-f70.google.com with SMTP id u30-20020adfa19e000000b00159aba4fe42so690750wru.9 for ; Fri, 10 Sep 2021 09:05:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=xxnw5LtOl2RIuJdiKctDNBBhDpulOxZLX+GqCLqTeoI=; b=1o7c4mNXo42A6QHGktJyq2v1Wz6S5iNPxj2fMvw/hsGwyCAVXle67sYBsvetqyJb75 5ZcXb3xUPdvMgrYkizpnH58q2E1qyYntLfv8M5TcRUfnmEZH7lBScmth6kvN3oiVNwXd wldZCTjgcmuyPDhb5JLDi/Ix5bYhztP4dSnYe/umEjcPsqYytN38rOx3sjTFhQVH5sC6 f/Nh1itIwUcHARTVgTvceL9vaVTXbkkuCyu0vY2AuWqd2tWhztRlOymf7YhT3r4vFNdf ARsqVsQmylZKC0XF3EMTbwdFNBpZvz+OcKqsEha0aDVtVMTd7r6ZqZbNyLO47dFmXft8 cYdw== X-Gm-Message-State: AOAM530OgKQOA3aMWU6fBoW+W1X80O/5Nt6MHdJKYJ+ZnLrqjRm6lTU+ JlUmYdT/uZmGbE4Mbv7ucTpRAjDpxvnLECgG/ocoMZ29T1s+junaQC4OKogvBLsWTmC5QMNB004 K+ssS8j4= X-Received: by 2002:adf:b741:: with SMTP id n1mr10647925wre.120.1631289921419; Fri, 10 Sep 2021 09:05:21 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyb/wQl72f2i3lPI+t7EiKGd3j5HcGaHYYjWDh6Hhx3o40FlYdsRndfrBHzgMJr4Au44+FWtA== X-Received: by 2002:adf:b741:: with SMTP id n1mr10647892wre.120.1631289921096; Fri, 10 Sep 2021 09:05:21 -0700 (PDT) Received: from abulafia.quesejoda.com ([139.47.33.227]) by smtp.gmail.com with ESMTPSA id n66sm4758993wmn.2.2021.09.10.09.05.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 10 Sep 2021 09:05:20 -0700 (PDT) Subject: Re: More aggressive threading causing loop-interchange-9.c regression To: Jeff Law , Richard Biener Cc: Michael Matz , GCC Mailing List , Andrew MacLeod References: <09e48b82-bc51-405e-7680-89a5f08e4e8f@redhat.com> <6d5695e4-e4eb-14a5-46a6-f425d1514008@redhat.com> <07fdd2bb-e6b7-fe66-f6a0-df6ec0704ae4@redhat.com> <8c49db8d-3119-0dc2-2bbb-4062c8d5d53b@redhat.com> From: Aldy Hernandez Message-ID: <991f9fe6-0597-4883-c114-04e01098b37a@redhat.com> Date: Fri, 10 Sep 2021 18:05:19 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Sep 2021 16:05:26 -0000 On 9/10/21 5:43 PM, Jeff Law wrote: > > > On 9/9/2021 3:21 AM, Aldy Hernandez wrote: >> >>> >>>    /* If this path does not thread through the loop latch, then we are >>>       using the FSM threader to find old style jump threads. This >>>       is good, except the FSM threader does not re-use an existing >>>       threading path to reduce code duplication. >>> >>>       So for that case, drastically reduce the number of statements >>>       we are allowed to copy.  */ >> >> *blink* >> >> Woah.  The backward threader has been using FSM threads >> indiscriminately as far as I can remember.  I wonder what would break >> if we "fixed it". > ?!?  I'm not sure what you're suggesting here.  If you s/FSM > threader/backwards threader/ in the comment does it make more sense? The > term FSM really should largely have been dropped as the backwards > threader was improved to handle more cases. back_threader_registry::register_path() uses EDGE_FSM_THREAD as the thread type to register threads. I was wondering if it should have been some combination of EDGE_START_JUMP_THREAD / EDGE_*_COPY_SRC_BLOCK, etc. I (purposely) know nothing about the underlying threading types ;-). But if the backwards threader has been improved, then perhaps we should just remove the confusing FSM references. > > > > >> >>> >>> so these cases should use the "old style" validity/costing metrics >>> and thus >>> classify threading opportunities in a different way? >> >> Jeff, do you have any insight here? > This is precisely what you're cleaning up. > > >> >>> >>> I think today "backwards" vs, "forwards" only refers to the way we find >>> threading opportunities. >> >> Yes, it's a mess. >> >> I ran some experiments a while back, and my current work on the >> enhanced solver/threader, can fold virtually everything the >> DOM/threader gets (even with its use of const_and_copies, avail_exprs, >> and evrp_range_analyzer), while getting 5% more DOM threads and 1% >> more overall threads.  That is, I've been testing if the path solver >> can solve everything the DOM threader needs (the hybrid approach I >> mentioned). >> >> Unfortunately, replacing the forward threader right now is not >> feasible for a few reasons: > Right.  But I thought the short term goal was to replace/remove the > forward threading from VRP.   Dropping from DOM is going to be tougher. My current thinking is that replacing the forward VRP threader with a hybrid one is a gentler approach to the longer term goal of replacing the forward threader altogether. However, all the work I've been doing could go either way-- we could try the forward/VRP replacement or a hybrid approach. It will all use the path solver underneath. My main problem with replacing the forward/VRP with a backward client is that the cost models are so different that it was difficult to compare how we fared. I constantly ran into threads the solver could handle just fine, but profitable_path_p was holding it back. FWIW, we get virtually everything the forward threader gets, minus a very few things. At least when I plug in the solver to the DOM/forwarder threader, it can solve everything it can (minus noise and floats). If you prefer a backward threader instance to replace the VRP/forward threader, I'm game. It's just harder to compare. Either way (backward threader or a hybrid forward+solver) uses the same underlying solver which is solid. >> a) The const_and_copies/avail_exprs relation framework can do floats, >> and that's next year's ranger work. > Right.  I'd actually run into this as well when I wanted to drop all the > range bits out of DOM and rely exclusively on EVRP.   It'd still be a > step forward to rip out the EVRP engine from DOM and simplify all the > code that derives one equivalence from another so that it's only working > on FP values. Sure. > >> >> b) Even though we can seemingly fold everything DOM/threader does, in >> order to replace it with a backward threader instance we'd have to >> merge the cost/profitability code scattered throughout the forward >> threader, as well as the EDGE_FSM* / EDGE_COPY* business. > Right.  This is a prerequisite.  Though some of the costing will need to > be conditional on the threader being used.  Refer back to the discussion > around how the forward threader can commonize thread paths that lead to > the same destination while the backwards threader can not. Yup, yup. > >> >> c) DOM changes the IL as it goes.  Though we could conceivably divorce >> do the threading after DOM is done. > The only reason threading runs in parallel with DOM is so that it can > use the context sensitive equivalences.  With the infrastructure you're > building, there's a reasonable chance we can move to a model where we > run DOM (and in the long term a simpler DOM) and threading as distinct, > independent passes. Andrew mumbled something about replacing all of DOM eventually :-). Well, except that value-numbering business I bet. Aldy