From: Aldy Hernandez
Date: Tue, 2 Aug 2022 12:21:36 +0200
Subject: Re: [PATCH] Properly honor param_max_fsm_thread_path_insns in backwards threader
To: Richard Biener
Cc: gcc-patches, "MacLeod, Andrew", Jeff Law
In-Reply-To: <04261.122080204410800126@us-mta-529.us.mimecast.lan>
List-Id: Gcc-patches mailing list

Unfortunately, this was before my time, so I don't know.

That being said, thanks for tackling these issues that my work
triggered last release.  Much appreciated.

Aldy

On Tue, Aug 2, 2022 at 10:41 AM Richard Biener wrote:
>
> I am trying to make sense of back_threader_profitability::profitable_path_p
> and the first thing I notice is that we do
>
>   /* Threading is profitable if the path duplicated is hot but also
>      in a case we separate cold path from hot path and permit optimization
>      of the hot path later.  Be on the agressive side here. In some testcases,
>      as in PR 78407 this leads to noticeable improvements.  */
>   if (m_speed_p
>       && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
>           || contains_hot_bb))
>     {
>       if (n_insns >= param_max_fsm_thread_path_insns)
>         {
>           if (dump_file && (dump_flags & TDF_DETAILS))
>             fprintf (dump_file, " FAIL: Jump-thread path not considered: "
>                      "the number of instructions on the path "
>                      "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
>           return false;
>         }
> ...
>     }
>   else if (!m_speed_p && n_insns > 1)
>     {
>       if (dump_file && (dump_flags & TDF_DETAILS))
>         fprintf (dump_file, " FAIL: Jump-thread path not considered: "
>                  "duplication of %i insns is needed and optimizing for size.\n",
>                  n_insns);
>       return false;
>     }
> ...
>   return true;
>
> thus we apply the n_insns >= param_max_fsm_thread_path_insns only
> to "hot paths".  The comment above this isn't entirely clear whether
> this is by design ("Be on the aggressive side here ...") but I think
> this is a mistake.  In fact the "hot path" check seems entirely
> useless since if the path is not hot we simply continue threading it.
>
> I have my reservations about how we compute hot (contains_hot_bb
> in particular), but the following first refactors the above to apply
> the size constraints always and then _not_ threading if the path
> is not considered hot (but allow threading if n_insns <= 1 as with
> the !m_speed_p case).
>
> As for contains_hot_bb - it might be that this consciously captures
> the case where we separate a cold from a hot path even though the
> threaded path itself is cold.  Consider
>
>        A
>       / \  (unlikely)
>      B   C
>       \ /
>        D
>       / \
>      ..  abort()
>
> when we want to thread A -> B -> D -> abort () and A (or D)
> has a hot BB count then we have contains_hot_bb even though
> the counts on the path itself are small.  In fact when we
> thread the only relevant count for the resulting threaded
> path is the count of A with the A->C probability applied
> (that should also be the count to subtract from the blocks
> we copied - sth missing for the backwards threader as well).
>
> So I'm wondering how the logic computing contains_hot_bb
> relates to the above comment before the costing block.
> Does anyone remember?
>
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>
>         * tree-ssa-threadbackward.cc
>         (back_threader_profitability::profitable_path_p): Apply
>         size constraints to all paths.  Do not thread cold paths.
> ---
>  gcc/tree-ssa-threadbackward.cc | 53 +++++++++++++++++++++-------------
>  1 file changed, 33 insertions(+), 20 deletions(-)
>
> diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
> index 0519f2a8c4b..a887568032b 100644
> --- a/gcc/tree-ssa-threadbackward.cc
> +++ b/gcc/tree-ssa-threadbackward.cc
> @@ -761,22 +761,43 @@ back_threader_profitability::profitable_path_p (const vec &m_path,
>             *creates_irreducible_loop = true;
>      }
>
> -  /* Threading is profitable if the path duplicated is hot but also
> +  const int max_cold_insns = 1;
> +  if (!m_speed_p && n_insns > max_cold_insns)
> +    {
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +        fprintf (dump_file, " FAIL: Jump-thread path not considered: "
> +                 "duplication of %i insns is needed and optimizing for size.\n",
> +                 n_insns);
> +      return false;
> +    }
> +  else if (n_insns >= param_max_fsm_thread_path_insns)
> +    {
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +        fprintf (dump_file, " FAIL: Jump-thread path not considered: "
> +                 "the number of instructions on the path "
> +                 "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
> +      return false;
> +    }
> +
> +  /* Threading is profitable if the path duplicated is small or hot but also
>       in a case we separate cold path from hot path and permit optimization
>       of the hot path later.  Be on the agressive side here. In some testcases,
>       as in PR 78407 this leads to noticeable improvements.  */
> -  if (m_speed_p
> -      && ((taken_edge && optimize_edge_for_speed_p (taken_edge))
> -          || contains_hot_bb))
> +  if (!(n_insns <= max_cold_insns
> +        || contains_hot_bb
> +        || (taken_edge && optimize_edge_for_speed_p (taken_edge))))
> +    {
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +        fprintf (dump_file, " FAIL: Jump-thread path not considered: "
> +                 "path is not profitable to thread.\n");
> +      return false;
> +    }
> +
> +  /* If the path is not small to duplicate and either the entry or
> +     the final destination is probably never executed avoid separating
> +     the cold path since that can lead to spurious diagnostics.  */
> +  if (n_insns > max_cold_insns)
>      {
> -      if (n_insns >= param_max_fsm_thread_path_insns)
> -        {
> -          if (dump_file && (dump_flags & TDF_DETAILS))
> -            fprintf (dump_file, " FAIL: Jump-thread path not considered: "
> -                     "the number of instructions on the path "
> -                     "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n");
> -          return false;
> -        }
>        if (taken_edge && probably_never_executed_edge_p (cfun, taken_edge))
>          {
>            if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -794,14 +815,6 @@ back_threader_profitability::profitable_path_p (const vec &m_path,
>            return false;
>          }
>      }
> -  else if (!m_speed_p && n_insns > 1)
> -    {
> -      if (dump_file && (dump_flags & TDF_DETAILS))
> -        fprintf (dump_file, " FAIL: Jump-thread path not considered: "
> -                 "duplication of %i insns is needed and optimizing for size.\n",
> -                 n_insns);
> -      return false;
> -    }
>
>    /* We avoid creating irreducible inner loops unless we thread through
>       a multiway branch, in which case we have deemed it worth losing
> --
> 2.35.3
>
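
For anyone following the thread, here is a minimal standalone sketch of how
the size/hotness gate differs before and after the quoted patch.  It is not
GCC code: PathInfo, old_gate, new_gate and check_examples are hypothetical
names, the limit of 100 is only an illustrative value, and the parts the
quote elides with "..." (plus the probably-never-executed and irreducible-loop
checks) are assumed not to reject anything.  Only the conditions themselves
are taken from the quoted hunks.

```cpp
#include <cassert>

// Hypothetical stand-in for the inputs the quoted checks look at.
struct PathInfo
{
  bool speed_p;         // m_speed_p: the path is optimized for speed
  int n_insns;          // number of insns that would be duplicated
  bool taken_edge_hot;  // taken_edge && optimize_edge_for_speed_p (taken_edge)
  bool contains_hot_bb; // some block on the path is considered hot
};

// Gate as quoted before the patch.  Assumption: the elided "..." parts
// never reject, so anything not caught here is threaded.
bool
old_gate (const PathInfo &p, int max_insns)
{
  if (p.speed_p && (p.taken_edge_hot || p.contains_hot_bb))
    {
      // The size limit is only consulted for hot paths.
      if (p.n_insns >= max_insns)
        return false;
    }
  else if (!p.speed_p && p.n_insns > 1)
    return false;
  return true;
}

// Gate as restructured by the patch: size limits first, for every path,
// then reject paths that are neither tiny nor hot.
bool
new_gate (const PathInfo &p, int max_insns)
{
  const int max_cold_insns = 1;
  if (!p.speed_p && p.n_insns > max_cold_insns)
    return false;
  else if (p.n_insns >= max_insns)
    return false;

  if (!(p.n_insns <= max_cold_insns
        || p.contains_hot_bb
        || p.taken_edge_hot))
    return false;
  return true;
}

// A few cases showing where the two gates agree and where they differ.
void
check_examples ()
{
  const int limit = 100; // illustrative value only

  PathInfo big_cold = { true, 500, false, false };
  assert (old_gate (big_cold, limit));   // old: size limit never applied
  assert (!new_gate (big_cold, limit));  // new: rejected by the size limit

  PathInfo small_cold = { true, 50, false, false };
  assert (old_gate (small_cold, limit)); // old: cold path still threaded
  assert (!new_gate (small_cold, limit)); // new: not hot, not tiny

  PathInfo tiny_cold = { true, 1, false, false };
  assert (old_gate (tiny_cold, limit) && new_gate (tiny_cold, limit));

  PathInfo small_hot = { true, 50, true, false };
  assert (old_gate (small_hot, limit) && new_gate (small_hot, limit));

  PathInfo for_size = { false, 2, false, true };
  assert (!old_gate (for_size, limit) && !new_gate (for_size, limit));
}
```

Under these assumptions the two gates differ only for cold paths of more
than one insn when optimizing for speed, which is the case described above
as "we simply continue threading it".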