From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id 0B07C385B513 for ; Mon, 28 Nov 2022 09:54:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 0B07C385B513 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.cz Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D07741FD8A; Mon, 28 Nov 2022 09:53:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1669629238; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6nRn51r9bx5aIu8fICcdqWokC4gdXP5K7O6pFAI85gg=; b=Aku4GGkkL95RySf8jVfEAI9g4ud9pt9Z44/R8kB6YL9jkmcAgSsZkIi+X0h9w3OOjn7RTp q1nC74aO7V767nsjhEkiD2n1c863v0Z6ZMtUiOr/kt1y7pca2xnjhmFgP7OYEmeBXF73TL 7KG7E8/mntzjpnX7kBvtnREXbzxCRMA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1669629238; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6nRn51r9bx5aIu8fICcdqWokC4gdXP5K7O6pFAI85gg=; b=HTYVl042G+QyLeCi9uo1zttqZwtoT7GKJRYyv96H8HFbq/Vs4TkjKKmbDxA17RfhvjbiD8 ky9pI+wh7O9UIICw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id BD1DD1326E; Mon, 28 Nov 2022 09:53:58 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id NX88LTaFhGM6DQAAMHmgww (envelope-from ); Mon, 28 Nov 2022 09:53:58 +0000 Message-ID: <7eb66e9c-e412-73ea-136c-6788b69daab8@suse.cz> Date: Mon, 28 Nov 2022 10:53:58 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.5.0 Subject: Re: [PATCH] Improve profile handling in switch lowering. Content-Language: en-US From: =?UTF-8?Q?Martin_Li=c5=a1ka?= To: gcc-patches@gcc.gnu.org Cc: Jan Hubicka References: <596c4483-864c-1d03-955c-69906171f037@suse.cz> <8e839057-51b4-a036-c9f2-8049e41d5814@suse.cz> <07de3a4b-0250-fbee-dbe6-db16728a0e27@suse.cz> In-Reply-To: <07de3a4b-0250-fbee-dbe6-db16728a0e27@suse.cz> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: PING^4 On 4/7/22 16:00, Martin Liška wrote: > PING^3 > > On 3/24/22 11:22, Martin Liška wrote: >> PING^2 >> >> On 3/4/22 14:47, Martin Liška wrote: >>> PING^1 >>> >>> On 1/26/22 12:11, Martin Liška wrote: >>>> Hello. >>>> >>>> Right now, switch lowering does not update basic_block::count values >>>> so that they are uninitiliazed. Moreover, I've updated probability scaling >>>> when a more complex expansion happens. There are still some situations where >>>> the profile is a bit imprecise, but the patch improves rapidly the current situation. >>>> >>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >>>> >>>> Ready to be installed? >>>> Thanks, >>>> Martin >>>> >>>>      PR tree-optimization/101301 >>>>      PR tree-optimization/103680 >>>> >>>> gcc/ChangeLog: >>>> >>>>      * tree-switch-conversion.cc (bit_test_cluster::emit): >>>>      Handle correctly remaining probability. >>>>      (switch_decision_tree::try_switch_expansion): Fix BB's count >>>>      where a cluster expansion happens. >>>>      (switch_decision_tree::emit_cmp_and_jump_insns): Fill up also >>>>      BB count. >>>>      (switch_decision_tree::do_jump_if_equal): Likewise. >>>>      (switch_decision_tree::emit_case_nodes): Handle special case >>>>      for BT expansion which can also fallback to a default BB. >>>>      * tree-switch-conversion.h (cluster::cluster): Add >>>>      m_default_prob probability. >>>> --- >>>>   gcc/tree-switch-conversion.cc | 51 ++++++++++++++++++++++++----------- >>>>   gcc/tree-switch-conversion.h  |  8 +++++- >>>>   2 files changed, 43 insertions(+), 16 deletions(-) >>>> >>>> diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc >>>> index 670397c87e4..d6679e8dee3 100644 >>>> --- a/gcc/tree-switch-conversion.cc >>>> +++ b/gcc/tree-switch-conversion.cc >>>> @@ -1538,10 +1538,12 @@ bit_test_cluster::emit (tree index_expr, tree index_type, >>>>         test[k].target_bb = n->m_case_bb; >>>>         test[k].label = n->m_case_label_expr; >>>>         test[k].bits = 0; >>>> +      test[k].prob = profile_probability::never (); >>>>         count++; >>>>       } >>>> >>>>         test[k].bits += n->get_range (n->get_low (), n->get_high ()); >>>> +      test[k].prob += n->m_prob; >>>> >>>>         lo = tree_to_uhwi (int_const_binop (MINUS_EXPR, n->get_low (), minval)); >>>>         if (n->get_high () == NULL_TREE) >>>> @@ -1629,6 +1631,11 @@ bit_test_cluster::emit (tree index_expr, tree index_type, >>>>                     /*simple=*/true, NULL_TREE, >>>>                     /*before=*/true, GSI_SAME_STMT); >>>> >>>> +  profile_probability subtree_prob = m_subtree_prob; >>>> +  profile_probability default_prob = m_default_prob; >>>> +  if (!default_prob.initialized_p ()) >>>> +    default_prob = m_subtree_prob.invert (); >>>> + >>>>     if (m_handles_entire_switch && entry_test_needed) >>>>       { >>>>         tree range = int_const_binop (MINUS_EXPR, maxval, minval); >>>> @@ -1639,9 +1646,10 @@ bit_test_cluster::emit (tree index_expr, tree index_type, >>>>                       /*simple=*/true, NULL_TREE, >>>>                       /*before=*/true, GSI_SAME_STMT); >>>>         tmp = fold_build2 (GT_EXPR, boolean_type_node, idx, range); >>>> +      default_prob = default_prob.apply_scale (1, 2); >>>>         basic_block new_bb >>>>       = hoist_edge_and_branch_if_true (&gsi, tmp, default_bb, >>>> -                     profile_probability::unlikely ()); >>>> +                     default_prob); >>>>         gsi = gsi_last_bb (new_bb); >>>>       } >>>> >>>> @@ -1662,14 +1670,12 @@ bit_test_cluster::emit (tree index_expr, tree index_type, >>>>     else >>>>       csui = tmp; >>>> >>>> -  profile_probability prob = profile_probability::always (); >>>> - >>>>     /* for each unique set of cases: >>>>          if (const & csui) goto target  */ >>>>     for (k = 0; k < count; k++) >>>>       { >>>> -      prob = profile_probability::always ().apply_scale (test[k].bits, >>>> -                             bt_range); >>>> +      profile_probability prob = test[k].prob / (subtree_prob + default_prob); >>>> +      subtree_prob -= test[k].prob; >>>>         bt_range -= test[k].bits; >>>>         tmp = wide_int_to_tree (word_type_node, test[k].mask); >>>>         tmp = fold_build2 (BIT_AND_EXPR, word_type_node, csui, tmp); >>>> @@ -1908,9 +1914,13 @@ switch_decision_tree::try_switch_expansion (vec &clusters) >>>>         /* Emit cluster-specific switch handling.  */ >>>>         for (unsigned i = 0; i < clusters.length (); i++) >>>>       if (clusters[i]->get_type () != SIMPLE_CASE) >>>> -      clusters[i]->emit (index_expr, index_type, >>>> -                 gimple_switch_default_label (m_switch), >>>> -                 m_default_bb, gimple_location (m_switch)); >>>> +      { >>>> +        edge e = single_pred_edge (clusters[i]->m_case_bb); >>>> +        e->dest->count = e->src->count.apply_probability (e->probability); >>>> +        clusters[i]->emit (index_expr, index_type, >>>> +                   gimple_switch_default_label (m_switch), >>>> +                   m_default_bb, gimple_location (m_switch)); >>>> +      } >>>>       } >>>> >>>>     fix_phi_operands_for_edges (); >>>> @@ -2162,6 +2172,7 @@ switch_decision_tree::emit_cmp_and_jump_insns (basic_block bb, tree op0, >>>>     edge false_edge = split_block (bb, cond); >>>>     false_edge->flags = EDGE_FALSE_VALUE; >>>>     false_edge->probability = prob.invert (); >>>> +  false_edge->dest->count = bb->count.apply_probability (prob.invert ()); >>>> >>>>     edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE); >>>>     true_edge->probability = prob; >>>> @@ -2192,6 +2203,7 @@ switch_decision_tree::do_jump_if_equal (basic_block bb, tree op0, tree op1, >>>>     edge false_edge = split_block (bb, cond); >>>>     false_edge->flags = EDGE_FALSE_VALUE; >>>>     false_edge->probability = prob.invert (); >>>> +  false_edge->dest->count = bb->count.apply_probability (prob.invert ()); >>>> >>>>     edge true_edge = make_edge (bb, label_bb, EDGE_TRUE_VALUE); >>>>     true_edge->probability = prob; >>>> @@ -2227,7 +2239,7 @@ switch_decision_tree::emit_case_nodes (basic_block bb, tree index, >>>>                    node->m_c->m_case_bb, p, loc); >>>>         /* Since this case is taken at this point, reduce its weight from >>>>        subtree_weight.  */ >>>> -      node->m_c->m_subtree_prob -= p; >>>> +      node->m_c->m_subtree_prob -= node->m_c->m_prob; >>>> >>>>         if (node->m_left != NULL && node->m_right != NULL) >>>>       { >>>> @@ -2246,6 +2258,7 @@ switch_decision_tree::emit_case_nodes (basic_block bb, tree index, >>>>              / (node->m_c->m_subtree_prob + default_prob)); >>>>             bb = do_jump_if_equal (bb, index, node->m_right->m_c->get_low (), >>>>                        node->m_right->m_c->m_case_bb, p, loc); >>>> +          node->m_c->m_subtree_prob -= node->m_right->m_c->m_prob; >>>> >>>>             p = (node->m_left->m_c->m_prob >>>>              / (node->m_c->m_subtree_prob + default_prob)); >>>> @@ -2262,6 +2275,7 @@ switch_decision_tree::emit_case_nodes (basic_block bb, tree index, >>>>             p = ((node->m_right->m_c->m_subtree_prob >>>>               + default_prob.apply_scale (1, 2)) >>>>              / (node->m_c->m_subtree_prob + default_prob)); >>>> +          test_bb->count = bb->count.apply_probability (p); >>>>             bb = emit_cmp_and_jump_insns (bb, index, node->m_c->get_high (), >>>>                           GT_EXPR, test_bb, p, loc); >>>>             default_prob = default_prob.apply_scale (1, 2); >>>> @@ -2348,21 +2362,28 @@ switch_decision_tree::emit_case_nodes (basic_block bb, tree index, >>>>        is the one to branch to.  */ >>>>         if (node->has_child () || node->m_c->get_type () != SIMPLE_CASE) >>>>       { >>>> +      bool is_bt = node->m_c->get_type () == BIT_TEST; >>>> +      int parts = is_bt ? 3 : 2; >>>> + >>>>         /* Branch to a label where we will handle it later.  */ >>>>         basic_block test_bb = split_edge (single_succ_edge (bb)); >>>>         redirect_edge_succ (single_pred_edge (test_bb), >>>>                     single_succ_edge (bb)->dest); >>>> >>>> - >>>> -       profile_probability right_prob = profile_probability::never (); >>>> -       if (node->m_right) >>>> -         right_prob = node->m_right->m_c->m_subtree_prob; >>>> -      p = ((right_prob + default_prob.apply_scale (1, 2)) >>>> +      profile_probability right_prob = profile_probability::never (); >>>> +      if (node->m_right) >>>> +        right_prob = node->m_right->m_c->m_subtree_prob; >>>> +      p = ((right_prob + default_prob.apply_scale (1, parts)) >>>>              / (node->m_c->m_subtree_prob + default_prob)); >>>> +      test_bb->count = bb->count.apply_probability (p); >>>> >>>>         bb = emit_cmp_and_jump_insns (bb, index, node->m_c->get_high (), >>>>                       GT_EXPR, test_bb, p, loc); >>>> -      default_prob = default_prob.apply_scale (1, 2); >>>> + >>>> +      default_prob = default_prob.apply_scale (1, parts); >>>> +      node->m_c->m_subtree_prob -= right_prob; >>>> +      if (is_bt) >>>> +        node->m_c->m_default_prob = default_prob; >>>> >>>>         /* Value belongs to this node or to the left-hand subtree.  */ >>>>         p = node->m_c->m_prob / (node->m_c->m_subtree_prob + default_prob); >>>> diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h >>>> index e969c051a05..5f3f99353ba 100644 >>>> --- a/gcc/tree-switch-conversion.h >>>> +++ b/gcc/tree-switch-conversion.h >>>> @@ -102,6 +102,10 @@ public: >>>>     /* Probability of reaching subtree rooted at this node.  */ >>>>     profile_probability m_subtree_prob; >>>> >>>> +  /* Probability of default case when reaching the node. >>>> +     It is used by bit-test right now.  */ >>>> +  profile_probability m_default_prob; >>>> + >>>>   protected: >>>>     /* Default constructor.  */ >>>>     cluster () {} >>>> @@ -110,7 +114,8 @@ protected: >>>>   cluster::cluster (tree case_label_expr, basic_block case_bb, >>>>             profile_probability prob, profile_probability subtree_prob): >>>>     m_case_label_expr (case_label_expr), m_case_bb (case_bb), m_prob (prob), >>>> -  m_subtree_prob (subtree_prob) >>>> +  m_subtree_prob (subtree_prob), >>>> +  m_default_prob (profile_probability::uninitialized ()) >>>>   { >>>>   } >>>> >>>> @@ -542,6 +547,7 @@ public: >>>>     basic_block target_bb; >>>>     tree label; >>>>     int bits; >>>> +  profile_probability prob; >>>> >>>>     /* Comparison function for qsort to order bit tests by decreasing >>>>        probability of execution.  */ >>> >> >