* [PATCH AutoFDO/2]Treat ZERO as common profile probability/count @ 2018-10-31 8:33 bin.cheng 2018-10-31 9:43 ` Richard Biener 2018-10-31 15:02 ` Jeff Law 0 siblings, 2 replies; 15+ messages in thread From: bin.cheng @ 2018-10-31 8:33 UTC (permalink / raw) To: gcc-patches [-- Attachment #1: Type: text/plain, Size: 1423 bytes --] Hi, In new profile probability/count infra, we have different precision quality categories, and probabilities/counts of different categories are not supposed to be compared or calculated. Though in general is an improvement, it introduces unexpected behavior. Specifically, class profile_probablity and profile_count themselves are implemented by comparing probabilities/counts against profile_count::zero(). while zero() is of profile_precision category, it's always compared different to zero of other precision categories including afdo. I can see two ways fixing this: 1) Treat zero as a common probability/count regardless of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison against probability_count::zero(). 2) requires lots of code changes so I went with 1) in this patch set. This patch doesn't handle "always" but it might be. This patch also corrects a minor issue where we try to invert an uninitialized value. Bootstrap and test on x86_64 in patch set. Is it OK? Thanks, bin 2018-10-31 Bin Cheng <bin.cheng@linux.alibaba.com> * expmed.c (emit_store_flag_force): Use profile_probability::always. * profile-count.h (profile_probability::always): Add parameter. (profile_probability::operator==, profile_count::operator==): Treat ZERO as common probability/count regardless of its quality. (profile_probability::invert): Don't invert uninitialized probability. [-- Attachment #2: 0002-Handle-ZERO-profile-count-prob-as-a-general-value-fo.patch --] [-- Type: application/octet-stream, Size: 2760 bytes --] From 9532b75d548aee18929396dd3f5b05ffab1f32f5 Mon Sep 17 00:00:00 2001 From: chengbin <bin.cheng@linux.alibaba.com> Date: Thu, 16 Aug 2018 12:09:48 +0800 Subject: [PATCH 2/4] Handle ZERO profile count/prob as a general value for different qualities. Don't invert uninitialized profile probability. --- gcc/expmed.c | 2 +- gcc/profile-count.h | 16 ++++++++++------ 2 files changed, 11 insertions(+), 7 deletions(-) diff --git a/gcc/expmed.c b/gcc/expmed.c index 444d6a82f98..6e95f3c76eb 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -6166,7 +6166,7 @@ emit_store_flag_force (rtx target, enum rtx_code code, rtx op0, rtx op1, emit_move_insn (target, trueval); label = gen_label_rtx (); do_compare_rtx_and_jump (op0, op1, code, unsignedp, mode, NULL_RTX, NULL, - label, profile_probability::uninitialized ()); + label, profile_probability::always ()); emit_move_insn (target, falseval); emit_label (label); diff --git a/gcc/profile-count.h b/gcc/profile-count.h index f4d0c340a0a..7003d030d58 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -200,11 +200,11 @@ public: ret.m_quality = profile_guessed; return ret; } - static profile_probability always () + static profile_probability always (enum profile_quality q = profile_precise) { profile_probability ret; ret.m_val = max_probability; - ret.m_quality = profile_precise; + ret.m_quality = q; return ret; } /* Probabilities which has not been initialized. Either because @@ -284,7 +284,8 @@ public: /* Basic operations. */ bool operator== (const profile_probability &other) const { - return m_val == other.m_val && m_quality == other.m_quality; + return (m_val == other.m_val + && (m_quality == other.m_quality || m_val == 0)); } profile_probability operator+ (const profile_probability &other) const { @@ -459,10 +460,12 @@ public: return RDIV (val * m_val, max_probability); } - /* Return 1-*THIS. */ + /* Return 1-*THIS. It's meaningless to invert an uninitialized value. */ profile_probability invert () const { - return profile_probability::always() - *this; + if (! initialized_p ()) + return *this; + return profile_probability::always(m_quality) - *this; } /* Return THIS with quality dropped to GUESSED. */ @@ -754,7 +757,8 @@ public: /* Basic operations. */ bool operator== (const profile_count &other) const { - return m_val == other.m_val && m_quality == other.m_quality; + return (m_val == other.m_val + && (m_quality == other.m_quality || m_val == 0)); } profile_count operator+ (const profile_count &other) const { -- 2.14.4.44.g2045bb6 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-10-31 8:33 [PATCH AutoFDO/2]Treat ZERO as common profile probability/count bin.cheng @ 2018-10-31 9:43 ` Richard Biener 2018-10-31 9:57 ` Bin.Cheng ` (2 more replies) 2018-10-31 15:02 ` Jeff Law 1 sibling, 3 replies; 15+ messages in thread From: Richard Biener @ 2018-10-31 9:43 UTC (permalink / raw) To: bin.cheng, Jan Hubicka; +Cc: GCC Patches On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > Hi, > In new profile probability/count infra, we have different precision quality categories, > and probabilities/counts of different categories are not supposed to be compared or > calculated. Though in general is an improvement, it introduces unexpected behavior. > Specifically, class profile_probablity and profile_count themselves are implemented > by comparing probabilities/counts against profile_count::zero(). while zero() is of > profile_precision category, it's always compared different to zero of other precision > categories including afdo. > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > in this patch set. This patch doesn't handle "always" but it might be. > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > Bootstrap and test on x86_64 in patch set. Is it OK? I'll defer on the emit_store_flag_force change, likewise for the zero handling in compares - I don't think zeros of different qualities should compare equal. Would compares against ::always() not have the very same issue? Likewise ::even(), ::likely(), etc.? Those always get guessed quality. The invert change looks OK to me. The related change to the always() API would suggest to replace guessed_always() with always (guessed) and also do similar changes throughout the whole API... Honza? Thanks, Richard. > Thanks, > bin > > 2018-10-31 Bin Cheng <bin.cheng@linux.alibaba.com> > > * expmed.c (emit_store_flag_force): Use profile_probability::always. > * profile-count.h (profile_probability::always): Add parameter. > (profile_probability::operator==, profile_count::operator==): Treat > ZERO as common probability/count regardless of its quality. > (profile_probability::invert): Don't invert uninitialized probability. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-10-31 9:43 ` Richard Biener @ 2018-10-31 9:57 ` Bin.Cheng 2018-11-02 5:31 ` bin.cheng [not found] ` <20181105141206.4ncu3s2v2jxv6o54@kam.mff.cuni.cz> 2 siblings, 0 replies; 15+ messages in thread From: Bin.Cheng @ 2018-10-31 9:57 UTC (permalink / raw) To: Richard Guenther; +Cc: bin.cheng, Jan Hubicka, gcc-patches List On Wed, Oct 31, 2018 at 5:11 PM Richard Biener <richard.guenther@gmail.com> wrote: > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > Hi, > > In new profile probability/count infra, we have different precision quality categories, > > and probabilities/counts of different categories are not supposed to be compared or > > calculated. Though in general is an improvement, it introduces unexpected behavior. > > Specifically, class profile_probablity and profile_count themselves are implemented > > by comparing probabilities/counts against profile_count::zero(). while zero() is of > > profile_precision category, it's always compared different to zero of other precision > > categories including afdo. > > > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > > in this patch set. This patch doesn't handle "always" but it might be. > > > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > > > Bootstrap and test on x86_64 in patch set. Is it OK? > > I'll defer on the emit_store_flag_force change, likewise for the zero > handling in > compares - I don't think zeros of different qualities should compare equal. > Would compares against ::always() not have the very same issue? > Likewise ::even(), > ::likely(), etc.? Those always get guessed quality. Yes, these values also affected if compared with precise category, but zero is the major issue. So 2) makes more sense when checking if a profile count is_zero/is_likely/is_always etc. regardless of its categories. Once with Honza's input, I can do some experiments. Thanks, bin > > The invert change looks OK to me. The related change to the always() API would > suggest to replace guessed_always() with always (guessed) and also do similar > changes throughout the whole API... > > Honza? > > Thanks, > Richard. > > > > Thanks, > > bin > > > > 2018-10-31 Bin Cheng <bin.cheng@linux.alibaba.com> > > > > * expmed.c (emit_store_flag_force): Use profile_probability::always. > > * profile-count.h (profile_probability::always): Add parameter. > > (profile_probability::operator==, profile_count::operator==): Treat > > ZERO as common probability/count regardless of its quality. > > (profile_probability::invert): Don't invert uninitialized probability. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-10-31 9:43 ` Richard Biener 2018-10-31 9:57 ` Bin.Cheng @ 2018-11-02 5:31 ` bin.cheng 2018-11-05 14:38 ` Jan Hubicka 2018-11-05 14:40 ` Jan Hubicka [not found] ` <20181105141206.4ncu3s2v2jxv6o54@kam.mff.cuni.cz> 2 siblings, 2 replies; 15+ messages in thread From: bin.cheng @ 2018-11-02 5:31 UTC (permalink / raw) To: GCC Patches; +Cc: Jan Hubicka, Richard Biener [-- Attachment #1: Type: text/plain, Size: 4953 bytes --] ------------------------------------------------------------------ Sender:Richard Biener <richard.guenther@gmail.com> Sent at:2018 Oct 31 (Wed) 17:11 To:bin.cheng <bin.cheng@linux.alibaba.com>; Jan Hubicka <hubicka@ucw.cz> Cc:GCC Patches <gcc-patches@gcc.gnu.org> Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote: >> >> Hi, >> In new profile probability/count infra, we have different precision quality categories, >> and probabilities/counts of different categories are not supposed to be compared or >> calculated. Though in general is an improvement, it introduces unexpected behavior. >> Specifically, class profile_probablity and profile_count themselves are implemented >> by comparing probabilities/counts against profile_count::zero(). while zero() is of >> profile_precision category, it's always compared different to zero of other precision >> categories including afdo. >> >> I can see two ways fixing this: 1) Treat zero as a common probability/count regardless >> of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison >> against probability_count::zero(). 2) requires lots of code changes so I went with 1) >> in this patch set. This patch doesn't handle "always" but it might be. >> >> This patch also corrects a minor issue where we try to invert an uninitialized value. >> >> Bootstrap and test on x86_64 in patch set. Is it OK? > Thanks for all the reviews. > I'll defer on the emit_store_flag_force change, likewise for the zero As Jeff pointed out too, this is discarded now. > handling in > compares - I don't think zeros of different qualities should compare equal. > Would compares against ::always() not have the very same issue? > Likewise ::even(), > ::likely(), etc.? Those always get guessed quality. In this version, I went with the second method which adds never_p/zero_p member function for probability and count. Check on ZERO value won't fail unexpected now. The patch also makes some other changes that avoids short-circuit returning on ZERO probability/count and let. > > The invert change looks OK to me. The related change to the always() API would > suggest to replace guessed_always() with always (guessed) and also do similar > changes throughout the whole API... The invert part is split as a standalone patch. I think adding a default precision parameter to member functions is a little bit better than having various version functions here, like you mentioned for from_gcov_type. Bootstrap and test on x86_64 in patch set. It fixes all (13) ICE autofdo tests. Thanks, bin 2018-11-02 Bin Cheng <bin.cheng@linux.alibaba.com> * profile-count.h (profile_probability::always): Add parameter. (profile_probability::invert): Don't invert uninitialized probability. 2018-11-02 Bin Cheng <bin.cheng@linux.alibaba.com> * profile-count.h (profile_probability::never_p): New. (profile_probability::operator+, +=, -, -=, *, *=, /, /=): Check ZERO probability using never_p. (profile_probability::apply_scale): Likewise. (profile_probability::apply): Check using uninitialized_p. (profile_count::zero_p): New. (profile_count::compatible_p): Check ZERO count using zero_p. (profile_count::operator+, +=, <, >, <=, >=, apply_scale): Likewise. (profile_count::apply_probability, probability_in): Likewise. (profile_count::operator-, -=): Likewise. Don't quick return if RHS profile_count is ZERO. (profile_count::max): Don't quick return in case of ZERO count. * profile-count.c (profile_count::to_frequency): Check ZERO profile probability or count using never_p or zero_p. (profile_count::to_cgraph_frequency, to_sreal_scale): Likewise. (profile_count::adjust_for_ipa_scaling): Likewise. (profile_count::combine_with_ipa_count): Likewise. (profile_probability::combine_with_count): Likewise. * bb-reorder.c (sanitize_hot_paths): Likewise. * cfg.c (update_bb_profile_for_threading): Likewise. (scale_bbs_frequencies_profile_count): Likewise. * ipa-profile.c (ipa_propagate_frequency_1): Likewise. (ipa_propagate_frequency): Likewise. * ipa-utility.c (ipa_merge_profiles): Likewise. * predict.c (maybe_hot_count_p, probably_never_executed): Likewise. (probably_never_executed_bb_p, unlikely_executed_stmt_p): Likewise. (combine_predictions_for_bb): Likewise. (drop_profile, handle_missing_profiles): Likewise. (propagate_unlikely_bbs_forward): Likewise. (determine_unlikely_bbs): Likewise. (estimate_bb_frequencies): Likewise. (compute_function_frequency, force_edge_cold): Likewise. * profile.c (compute_branch_probabilities): Likewise. * shrink-wrap.c (try_shrink_wrapping): Likewise. * tree-ssa-loop-manip.c (scale_dominated_blocks_in_loop): Likewise. (tree_transform_and_unroll_loop): Likewise. * tree-ssa-threadupdate.c (update_profile): Likewise. * tree-vect-loop.c (scale_profile_for_vect_loop): Likewise. [-- Attachment #2: 0002-Dont-invert-uninitialized-value.patch --] [-- Type: application/octet-stream, Size: 1384 bytes --] From 06307f1b86c619b87494c573eb54342a25778d71 Mon Sep 17 00:00:00 2001 From: chengbin <bin.cheng@linux.alibaba.com> Date: Thu, 1 Nov 2018 12:12:44 +0800 Subject: [PATCH 2/5] Dont invert uninitialized value --- gcc/profile-count.h | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/gcc/profile-count.h b/gcc/profile-count.h index f4d0c340a0a..4289bc5a004 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -200,11 +200,11 @@ public: ret.m_quality = profile_guessed; return ret; } - static profile_probability always () + static profile_probability always (enum profile_quality q = profile_precise) { profile_probability ret; ret.m_val = max_probability; - ret.m_quality = profile_precise; + ret.m_quality = q; return ret; } /* Probabilities which has not been initialized. Either because @@ -459,10 +459,12 @@ public: return RDIV (val * m_val, max_probability); } - /* Return 1-*THIS. */ + /* Return 1-*THIS. It's meaningless to invert an uninitialized value. */ profile_probability invert () const { - return profile_probability::always() - *this; + if (! initialized_p ()) + return *this; + return profile_probability::always (m_quality) - *this; } /* Return THIS with quality dropped to GUESSED. */ -- 2.14.4.44.g2045bb6 [-- Attachment #3: 0003-Check-zero-probability-count-using-member-function.patch --] [-- Type: application/octet-stream, Size: 28481 bytes --] From 5b0e38d146ff8cf618eab144f46601d74c5cd1e5 Mon Sep 17 00:00:00 2001 From: chengbin <bin.cheng@linux.alibaba.com> Date: Thu, 1 Nov 2018 12:10:37 +0800 Subject: [PATCH 3/5] Check zero probability count using member function. --- gcc/bb-reorder.c | 6 +-- gcc/cfg.c | 4 +- gcc/ipa-profile.c | 6 +-- gcc/ipa-utils.c | 6 +-- gcc/predict.c | 64 +++++++++++++++---------------- gcc/profile-count.c | 18 ++++----- gcc/profile-count.h | 92 ++++++++++++++++++++++----------------------- gcc/profile.c | 2 +- gcc/shrink-wrap.c | 2 +- gcc/tree-ssa-loop-manip.c | 4 +- gcc/tree-ssa-threadupdate.c | 2 +- gcc/tree-vect-loop.c | 2 +- 12 files changed, 103 insertions(+), 105 deletions(-) diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c index e20df160723..3350799a1b1 100644 --- a/gcc/bb-reorder.c +++ b/gcc/bb-reorder.c @@ -1572,8 +1572,7 @@ sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count, continue; /* Do not expect profile insanities when profile was not adjusted. */ - if (e->probability == profile_probability::never () - || e->count () == profile_count::zero ()) + if (e->probability.never_p () || (e->count ()).zero_p ()) continue; if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION) @@ -1604,8 +1603,7 @@ sanitize_hot_paths (bool walk_up, unsigned int cold_bb_count, if (e->flags & EDGE_DFS_BACK) continue; /* Do not expect profile insanities when profile was not adjusted. */ - if (e->probability == profile_probability::never () - || e->count () == profile_count::zero ()) + if (e->probability.never_p () || (e->count ()).zero_p ()) continue; /* Select the hottest edge using the edge count, if it is non-zero, then fallback to the edge probability. */ diff --git a/gcc/cfg.c b/gcc/cfg.c index 7be89d40604..1f776444f85 100644 --- a/gcc/cfg.c +++ b/gcc/cfg.c @@ -911,7 +911,7 @@ update_bb_profile_for_threading (basic_block bb, /* Now rescale the probabilities. */ taken_edge->probability -= prob; prob = prob.invert (); - if (prob == profile_probability::never ()) + if (prob.never_p ()) { if (dump_file) fprintf (dump_file, "Edge probabilities of bb %i has been reset, " @@ -940,7 +940,7 @@ scale_bbs_frequencies_profile_count (basic_block *bbs, int nbbs, profile_count num, profile_count den) { int i; - if (num == profile_count::zero () || den.nonzero_p ()) + if (num.zero_p () || den.nonzero_p ()) for (i = 0; i < nbbs; i++) bbs[i]->count = bbs[i]->count.apply_scale (num, den); } diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c index c74f4a4a41d..69e92431323 100644 --- a/gcc/ipa-profile.c +++ b/gcc/ipa-profile.c @@ -324,7 +324,7 @@ ipa_propagate_frequency_1 (struct cgraph_node *node, void *data) it is executed by the train run. Transfer the function only if all callers are unlikely executed. */ if (profile_info - && !(edge->callee->count.ipa () == profile_count::zero ()) + && !((edge->callee->count.ipa ()).zero_p ()) && (edge->caller->frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED || (edge->caller->global.inlined_to && edge->caller->global.inlined_to->frequency @@ -428,8 +428,8 @@ ipa_propagate_frequency (struct cgraph_node *node) if (node->count. ipa().initialized_p ()) { bool hot = false; - if (!(node->count. ipa() == profile_count::zero ()) - && node->count. ipa() >= get_hot_bb_threshold ()) + if (!((node->count.ipa ()).zero_p ()) + && node->count.ipa () >= get_hot_bb_threshold ()) hot = true; if (!hot) hot |= contains_hot_call_p (node); diff --git a/gcc/ipa-utils.c b/gcc/ipa-utils.c index 9985721f7da..d98e22463da 100644 --- a/gcc/ipa-utils.c +++ b/gcc/ipa-utils.c @@ -530,9 +530,9 @@ ipa_merge_profiles (struct cgraph_node *dst, pick more informative one (that is nonzero IPA if other is uninitialized, guessed or global0). */ if (!dstbb->count.ipa ().initialized_p () - || (dstbb->count.ipa () == profile_count::zero () + || ((dstbb->count.ipa ()).zero_p () && (srcbb->count.ipa ().initialized_p () - && !(srcbb->count.ipa () == profile_count::zero ())))) + && !((srcbb->count.ipa ()).zero_p ())))) { dstbb->count = srcbb->count; for (i = 0; i < EDGE_COUNT (srcbb->succs); i++) @@ -544,7 +544,7 @@ ipa_merge_profiles (struct cgraph_node *dst, } } else if (srcbb->count.ipa ().initialized_p () - && !(srcbb->count.ipa () == profile_count::zero ())) + && !((srcbb->count.ipa ()).zero_p ())) { for (i = 0; i < EDGE_COUNT (srcbb->succs); i++) { diff --git a/gcc/predict.c b/gcc/predict.c index ab2dc8ed031..24d7c7b45d2 100644 --- a/gcc/predict.c +++ b/gcc/predict.c @@ -155,7 +155,7 @@ maybe_hot_count_p (struct function *fun, profile_count count) { if (!count.initialized_p ()) return true; - if (count.ipa () == profile_count::zero ()) + if ((count.ipa ()).zero_p ()) return false; if (!count.ipa_p ()) { @@ -212,7 +212,7 @@ probably_never_executed (struct function *fun, profile_count count) { gcc_checking_assert (fun); - if (count.ipa () == profile_count::zero ()) + if ((count.ipa ()).zero_p ()) return true; /* Do not trust adjusted counts. This will make us to drop int cold section code with low execution count as a result of inlining. These low counts @@ -248,8 +248,8 @@ probably_never_executed_bb_p (struct function *fun, const_basic_block bb) static bool unlikely_executed_edge_p (edge e) { - return (e->count () == profile_count::zero () - || e->probability == profile_probability::never ()) + return ((e->count ()).zero_p () + || e->probability.never_p ()) || (e->flags & (EDGE_EH | EDGE_FAKE)); } @@ -810,7 +810,7 @@ unlikely_executed_stmt_p (gimple *stmt) static bool unlikely_executed_bb_p (basic_block bb) { - if (bb->count == profile_count::zero ()) + if (bb->count.zero_p ()) return true; if (bb == ENTRY_BLOCK_PTR_FOR_FN (cfun) || bb == EXIT_BLOCK_PTR_FOR_FN (cfun)) return false; @@ -1187,7 +1187,7 @@ combine_predictions_for_bb (basic_block bb, bool dry_run) e->probability = profile_probability::never (); if (!e->probability.initialized_p ()) nunknown++; - else if (e->probability == profile_probability::never ()) + else if (e->probability.never_p ()) nzero++; } @@ -3452,7 +3452,7 @@ drop_profile (struct cgraph_node *node, profile_count call_count) bool clear_zeros = !ENTRY_BLOCK_PTR_FOR_FN (fn)->count.nonzero_p (); FOR_ALL_BB_FN (bb, fn) - if (clear_zeros || !(bb->count == profile_count::zero ())) + if (clear_zeros || !(bb->count.zero_p ())) bb->count = bb->count.guessed_local (); fn->cfg->count_max = fn->cfg->count_max.guessed_local (); } @@ -3543,7 +3543,7 @@ handle_missing_profiles (void) struct cgraph_node *callee = e->callee; struct function *fn = DECL_STRUCT_FUNCTION (callee->decl); - if (!(e->count.ipa () == profile_count::zero ()) + if (!((e->count.ipa ()).zero_p ()) && callee->count.ipa ().nonzero_p ()) continue; if ((DECL_COMDAT (callee->decl) || DECL_EXTERNAL (callee->decl)) @@ -3627,7 +3627,7 @@ propagate_unlikely_bbs_forward (void) edge_iterator ei; edge e; - if (!(ENTRY_BLOCK_PTR_FOR_FN (cfun)->count == profile_count::zero ())) + if (!(ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.zero_p ())) { ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux = (void *)(size_t) 1; worklist.safe_push (ENTRY_BLOCK_PTR_FOR_FN (cfun)); @@ -3636,8 +3636,8 @@ propagate_unlikely_bbs_forward (void) { bb = worklist.pop (); FOR_EACH_EDGE (e, ei, bb->succs) - if (!(e->count () == profile_count::zero ()) - && !(e->dest->count == profile_count::zero ()) + if (!((e->count ()).zero_p ()) + && !(e->dest->count.zero_p ()) && !e->dest->aux) { e->dest->aux = (void *)(size_t) 1; @@ -3650,7 +3650,7 @@ propagate_unlikely_bbs_forward (void) { if (!bb->aux) { - if (!(bb->count == profile_count::zero ()) + if (!(bb->count.zero_p ()) && (dump_file && (dump_flags & TDF_DETAILS))) fprintf (dump_file, "Basic block %i is marked unlikely by forward prop\n", @@ -3677,7 +3677,7 @@ determine_unlikely_bbs () FOR_EACH_BB_FN (bb, cfun) { - if (!(bb->count == profile_count::zero ()) + if (!(bb->count.zero_p ()) && unlikely_executed_bb_p (bb)) { if (dump_file && (dump_flags & TDF_DETAILS)) @@ -3687,7 +3687,7 @@ determine_unlikely_bbs () } FOR_EACH_EDGE (e, ei, bb->succs) - if (!(e->probability == profile_probability::never ()) + if (!(e->probability.never_p ()) && unlikely_executed_edge_p (e)) { if (dump_file && (dump_flags & TDF_DETAILS)) @@ -3703,13 +3703,13 @@ determine_unlikely_bbs () auto_vec<int, 64> nsuccs; nsuccs.safe_grow_cleared (last_basic_block_for_fn (cfun)); FOR_ALL_BB_FN (bb, cfun) - if (!(bb->count == profile_count::zero ()) + if (!(bb->count.zero_p ()) && bb != EXIT_BLOCK_PTR_FOR_FN (cfun)) { nsuccs[bb->index] = 0; FOR_EACH_EDGE (e, ei, bb->succs) - if (!(e->probability == profile_probability::never ()) - && !(e->dest->count == profile_count::zero ())) + if (!(e->probability.never_p ()) + && !(e->dest->count.zero_p ())) nsuccs[bb->index]++; if (!nsuccs[bb->index]) worklist.safe_push (bb); @@ -3717,7 +3717,7 @@ determine_unlikely_bbs () while (worklist.length () > 0) { bb = worklist.pop (); - if (bb->count == profile_count::zero ()) + if (bb->count.zero_p ()) continue; if (bb != ENTRY_BLOCK_PTR_FOR_FN (cfun)) { @@ -3743,9 +3743,9 @@ determine_unlikely_bbs () bb->index); bb->count = profile_count::zero (); FOR_EACH_EDGE (e, ei, bb->preds) - if (!(e->probability == profile_probability::never ())) + if (!(e->probability.never_p ())) { - if (!(e->src->count == profile_count::zero ())) + if (!(e->src->count.zero_p ())) { gcc_checking_assert (nsuccs[e->src->index] > 0); nsuccs[e->src->index]--; @@ -3756,10 +3756,10 @@ determine_unlikely_bbs () } /* Finally all edges from non-0 regions to 0 are unlikely. */ FOR_ALL_BB_FN (bb, cfun) - if (!(bb->count == profile_count::zero ())) + if (!(bb->count.zero_p ())) FOR_EACH_EDGE (e, ei, bb->succs) - if (!(e->probability == profile_probability::never ()) - && e->dest->count == profile_count::zero ()) + if (!(e->probability.never_p ()) + && e->dest->count.zero_p ()) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "Edge %i->%i is unlikely because " @@ -3767,7 +3767,7 @@ determine_unlikely_bbs () bb->index, e->dest->index); e->probability = profile_probability::never (); } - if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count == profile_count::zero ()) + if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.zero_p ()) cgraph_node::get (current_function_decl)->count = profile_count::zero (); } @@ -3849,7 +3849,7 @@ estimate_bb_frequencies (bool force) /* If we have profile feedback in which this function was never executed, then preserve this info. */ - if (!(bb->count == profile_count::zero ())) + if (!(bb->count.zero_p ())) bb->count = count.guessed_local ().combine_with_ipa_count (ipa_count); cfun->cfg->count_max = cfun->cfg->count_max.max (bb->count); } @@ -3877,7 +3877,7 @@ compute_function_frequency (void) { int flags = flags_from_decl_or_type (current_function_decl); if ((ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa_p () - && ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa() == profile_count::zero ()) + && (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa()).zero_p ()) || lookup_attribute ("cold", DECL_ATTRIBUTES (current_function_decl)) != NULL) { @@ -3899,7 +3899,7 @@ compute_function_frequency (void) node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED; warn_function_cold (current_function_decl); - if (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa() == profile_count::zero ()) + if ((ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.ipa()).zero_p ()) return; FOR_EACH_BB_FN (bb, cfun) { @@ -4242,7 +4242,7 @@ force_edge_cold (edge e, bool impossible) /* If edge is already improbably or cold, just return. */ if (e->probability <= goal - && (!impossible || e->count () == profile_count::zero ())) + && (!impossible || (e->count ()).zero_p ())) return; FOR_EACH_EDGE (e2, ei, e->src->succs) if (e2 != e) @@ -4288,7 +4288,7 @@ force_edge_cold (edge e, bool impossible) is unlikely. */ else { - if (prob_sum == profile_probability::never ()) + if (prob_sum.never_p ()) e->probability = profile_probability::always (); else { @@ -4301,9 +4301,9 @@ force_edge_cold (edge e, bool impossible) if (current_ir_type () != IR_GIMPLE && e->src != ENTRY_BLOCK_PTR_FOR_FN (cfun)) update_br_prob_note (e->src); - if (e->src->count == profile_count::zero ()) + if (e->src->count.zero_p ()) return; - if (count_sum == profile_count::zero () && impossible) + if (count_sum.zero_p () && impossible) { bool found = false; if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)) @@ -4340,7 +4340,7 @@ force_edge_cold (edge e, bool impossible) BB has only one predecestor. This is common case when we are updating after loop transforms. */ if (!(prob_sum > profile_probability::never ()) - && count_sum == profile_count::zero () + && count_sum.zero_p () && single_pred_p (e->src) && e->src->count.to_frequency (cfun) > (impossible ? 0 : 1)) { diff --git a/gcc/profile-count.c b/gcc/profile-count.c index f4ab244e3a3..0e6109cd7b8 100644 --- a/gcc/profile-count.c +++ b/gcc/profile-count.c @@ -258,7 +258,7 @@ profile_count::to_frequency (struct function *fun) const { if (!initialized_p ()) return BB_FREQ_MAX; - if (*this == profile_count::zero ()) + if (this->zero_p ()) return 0; gcc_assert (REG_BR_PROB_BASE == BB_FREQ_MAX && fun->cfg->count_max.initialized_p ()); @@ -277,7 +277,7 @@ profile_count::to_cgraph_frequency (profile_count entry_bb_count) const { if (!initialized_p () || !entry_bb_count.initialized_p ()) return CGRAPH_FREQ_BASE; - if (*this == profile_count::zero ()) + if (this->zero_p ()) return 0; gcc_checking_assert (entry_bb_count.initialized_p ()); uint64_t scale; @@ -300,7 +300,7 @@ profile_count::to_sreal_scale (profile_count in, bool *known) const } if (known) *known = true; - if (*this == profile_count::zero ()) + if (this->zero_p ()) return 0; if (!in.m_val) @@ -327,7 +327,7 @@ profile_count::adjust_for_ipa_scaling (profile_count *num, if (*num == *den) return; /* Scaling to zero is always zero. */ - if (*num == profile_count::zero ()) + if (num->zero_p ()) return; /* If den is non-zero we are safe. */ if (den->force_nonzero () == *den) @@ -349,9 +349,9 @@ profile_count::combine_with_ipa_count (profile_count ipa) ipa = ipa.ipa (); if (ipa.nonzero_p ()) return ipa; - if (!ipa.initialized_p () || *this == profile_count::zero ()) + if (!ipa.initialized_p () || this->zero_p ()) return *this; - if (ipa == profile_count::zero ()) + if (ipa.zero_p ()) return this->global0 (); return this->global0adjusted (); } @@ -387,10 +387,10 @@ profile_probability::combine_with_count (profile_count count1, If counts are nonzero we can distribute accordingly. In remaining cases just avreage the values and hope for the best. */ if (*this == other || count1 == count2 - || (count2 == profile_count::zero () - && !(count1 == profile_count::zero ()))) + || (count2.zero_p () + && !(count1.zero_p ()))) return *this; - if (count1 == profile_count::zero () && !(count2 == profile_count::zero ())) + if (count1.zero_p () && !(count2.zero_p ())) return other; else if (count1.nonzero_p () || count2.nonzero_p ()) return *this * count1.probability_in (count1 + count2) diff --git a/gcc/profile-count.h b/gcc/profile-count.h index 4289bc5a004..2b5e3269250 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -218,6 +218,11 @@ public: } + /* Return true if value is zero. */ + bool never_p () const + { + return m_val == 0; + } /* Return true if value has been initialized. */ bool initialized_p () const { @@ -288,9 +293,9 @@ public: } profile_probability operator+ (const profile_probability &other) const { - if (other == profile_probability::never ()) + if (other.never_p ()) return *this; - if (*this == profile_probability::never ()) + if (this->never_p ()) return other; if (!initialized_p () || !other.initialized_p ()) return profile_probability::uninitialized (); @@ -302,9 +307,9 @@ public: } profile_probability &operator+= (const profile_probability &other) { - if (other == profile_probability::never ()) + if (other.never_p ()) return *this; - if (*this == profile_probability::never ()) + if (this->never_p ()) { *this = other; return *this; @@ -320,8 +325,7 @@ public: } profile_probability operator- (const profile_probability &other) const { - if (*this == profile_probability::never () - || other == profile_probability::never ()) + if (this->never_p ()) return *this; if (!initialized_p () || !other.initialized_p ()) return profile_probability::uninitialized (); @@ -332,8 +336,7 @@ public: } profile_probability &operator-= (const profile_probability &other) { - if (*this == profile_probability::never () - || other == profile_probability::never ()) + if (this->never_p ()) return *this; if (!initialized_p () || !other.initialized_p ()) return *this = profile_probability::uninitialized (); @@ -346,8 +349,7 @@ public: } profile_probability operator* (const profile_probability &other) const { - if (*this == profile_probability::never () - || other == profile_probability::never ()) + if (this->never_p () || other.never_p ()) return profile_probability::never (); if (!initialized_p () || !other.initialized_p ()) return profile_probability::uninitialized (); @@ -358,8 +360,7 @@ public: } profile_probability &operator*= (const profile_probability &other) { - if (*this == profile_probability::never () - || other == profile_probability::never ()) + if (this->never_p () || other.never_p ()) return *this = profile_probability::never (); if (!initialized_p () || !other.initialized_p ()) return *this = profile_probability::uninitialized (); @@ -372,7 +373,7 @@ public: } profile_probability operator/ (const profile_probability &other) const { - if (*this == profile_probability::never ()) + if (this->never_p ()) return profile_probability::never (); if (!initialized_p () || !other.initialized_p ()) return profile_probability::uninitialized (); @@ -399,7 +400,7 @@ public: } profile_probability &operator/= (const profile_probability &other) { - if (*this == profile_probability::never ()) + if (this->never_p ()) return *this = profile_probability::never (); if (!initialized_p () || !other.initialized_p ()) return *this = profile_probability::uninitialized (); @@ -454,7 +455,7 @@ public: gcov_type apply (gcov_type val) const { - if (*this == profile_probability::uninitialized ()) + if (! initialized_p ()) return val / 2; return RDIV (val * m_val, max_probability); } @@ -486,7 +487,7 @@ public: /* Return *THIS * NUM / DEN. */ profile_probability apply_scale (int64_t num, int64_t den) const { - if (*this == profile_probability::never ()) + if (this->never_p ()) return *this; if (!initialized_p ()) return profile_probability::uninitialized (); @@ -657,8 +658,7 @@ private: { if (!initialized_p () || !other.initialized_p ()) return true; - if (*this == profile_count::zero () - || other == profile_count::zero ()) + if (this->zero_p () || other.zero_p ()) return true; return ipa_p () == other.ipa_p (); } @@ -704,6 +704,11 @@ public: return m_val; } + /* Return true if value is zero. */ + bool zero_p () const + { + return m_val == 0; + } /* Return true if value has been initialized. */ bool initialized_p () const { @@ -760,9 +765,9 @@ public: } profile_count operator+ (const profile_count &other) const { - if (other == profile_count::zero ()) + if (other.zero_p ()) return *this; - if (*this == profile_count::zero ()) + if (this->zero_p ()) return other; if (!initialized_p () || !other.initialized_p ()) return profile_count::uninitialized (); @@ -775,9 +780,9 @@ public: } profile_count &operator+= (const profile_count &other) { - if (other == profile_count::zero ()) + if (other.zero_p ()) return *this; - if (*this == profile_count::zero ()) + if (this->zero_p ()) { *this = other; return *this; @@ -794,7 +799,7 @@ public: } profile_count operator- (const profile_count &other) const { - if (*this == profile_count::zero () || other == profile_count::zero ()) + if (this->zero_p ()) return *this; if (!initialized_p () || !other.initialized_p ()) return profile_count::uninitialized (); @@ -806,7 +811,7 @@ public: } profile_count &operator-= (const profile_count &other) { - if (*this == profile_count::zero () || other == profile_count::zero ()) + if (this->zero_p ()) return *this; if (!initialized_p () || !other.initialized_p ()) return *this = profile_count::uninitialized (); @@ -832,9 +837,9 @@ public: { if (!initialized_p () || !other.initialized_p ()) return false; - if (*this == profile_count::zero ()) - return !(other == profile_count::zero ()); - if (other == profile_count::zero ()) + if (this->zero_p ()) + return !(other.zero_p ()); + if (other.zero_p ()) return false; gcc_checking_assert (compatible_p (other)); return m_val < other.m_val; @@ -843,10 +848,10 @@ public: { if (!initialized_p () || !other.initialized_p ()) return false; - if (*this == profile_count::zero ()) + if (this->zero_p ()) return false; - if (other == profile_count::zero ()) - return !(*this == profile_count::zero ()); + if (other.zero_p ()) + return !(this->zero_p ()); gcc_checking_assert (compatible_p (other)); return initialized_p () && other.initialized_p () && m_val > other.m_val; } @@ -867,10 +872,10 @@ public: { if (!initialized_p () || !other.initialized_p ()) return false; - if (*this == profile_count::zero ()) + if (this->zero_p ()) return true; - if (other == profile_count::zero ()) - return (*this == profile_count::zero ()); + if (other.zero_p ()) + return (this->zero_p ()); gcc_checking_assert (compatible_p (other)); return m_val <= other.m_val; } @@ -878,10 +883,10 @@ public: { if (!initialized_p () || !other.initialized_p ()) return false; - if (other == profile_count::zero ()) + if (other.zero_p ()) return true; - if (*this == profile_count::zero ()) - return !(other == profile_count::zero ()); + if (this->zero_p ()) + return !(other.zero_p ()); gcc_checking_assert (compatible_p (other)); return m_val >= other.m_val; } @@ -925,10 +930,6 @@ public: return other; if (!other.initialized_p ()) return *this; - if (*this == profile_count::zero ()) - return other; - if (other == profile_count::zero ()) - return *this; gcc_checking_assert (compatible_p (other)); if (m_val < other.m_val || (m_val == other.m_val && m_quality < other.m_quality)) @@ -954,9 +955,9 @@ public: /* Scale counter according to PROB. */ profile_count apply_probability (profile_probability prob) const { - if (*this == profile_count::zero ()) + if (this->zero_p ()) return *this; - if (prob == profile_probability::never ()) + if (prob.never_p ()) return profile_count::zero (); if (!initialized_p ()) return profile_count::uninitialized (); @@ -986,9 +987,9 @@ public: } profile_count apply_scale (profile_count num, profile_count den) const { - if (*this == profile_count::zero ()) + if (this->zero_p ()) return *this; - if (num == profile_count::zero ()) + if (num.zero_p ()) return num; if (!initialized_p () || !num.initialized_p () || !den.initialized_p ()) return profile_count::uninitialized (); @@ -1071,8 +1072,7 @@ public: OVERALL. */ profile_probability probability_in (const profile_count overall) const { - if (*this == profile_count::zero () - && !(overall == profile_count::zero ())) + if (this->zero_p () && !(overall.zero_p ())) return profile_probability::never (); if (!initialized_p () || !overall.initialized_p () || !overall.m_val) diff --git a/gcc/profile.c b/gcc/profile.c index 2130319b081..70c6c59578d 100644 --- a/gcc/profile.c +++ b/gcc/profile.c @@ -707,7 +707,7 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum) determined zero counts. */ else FOR_ALL_BB_FN (bb, cfun) - if (!(bb->count == profile_count::zero ())) + if (!(bb->count.zero_p ())) bb->count = bb->count.global0 (); bb_gcov_counts.release (); diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c index 1ad73798747..88021f11c74 100644 --- a/gcc/shrink-wrap.c +++ b/gcc/shrink-wrap.c @@ -922,7 +922,7 @@ try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq) if (dump_file) fprintf (dump_file, "Duplicated %d to %d\n", bb->index, dup->index); - if (num == profile_count::zero () || den.nonzero_p ()) + if (num.zero_p () || den.nonzero_p ()) bb->count = bb->count.apply_scale (num, den); dup->count -= bb->count; } diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c index 5acee6c98f3..6b6388fc85e 100644 --- a/gcc/tree-ssa-loop-manip.c +++ b/gcc/tree-ssa-loop-manip.c @@ -1095,7 +1095,7 @@ scale_dominated_blocks_in_loop (struct loop *loop, basic_block bb, { basic_block son; - if (!den.nonzero_p () && !(num == profile_count::zero ())) + if (!den.nonzero_p () && !(num.zero_p ())) return; for (son = first_dom_son (CDI_DOMINATORS, bb); @@ -1378,7 +1378,7 @@ tree_transform_and_unroll_loop (struct loop *loop, unsigned factor, { /* Avoid dropping loop body profile counter to 0 because of zero count in loop's preheader. */ - if (freq_h.nonzero_p () && !(freq_e == profile_count::zero ())) + if (freq_h.nonzero_p () && !(freq_e.zero_p ())) freq_e = freq_e.force_nonzero (); scale_loop_frequencies (loop, freq_e.probability_in (freq_h)); } diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index 8080dff76d0..77b822c23e7 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -890,7 +890,7 @@ update_profile (edge epath, edge edup, profile_count path_in_count, dup_block->count = path_in_count; } - if (path_in_count == profile_count::zero ()) + if (path_in_count.zero_p ()) return; profile_count final_count = epath->count () - path_out_count; diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 177b284e9c6..2a47b20178d 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -8084,7 +8084,7 @@ scale_profile_for_vect_loop (struct loop *loop, unsigned vf) /* Avoid dropping loop body profile counter to 0 because of zero count in loop's preheader. */ - if (!(freq_e == profile_count::zero ())) + if (!(freq_e.zero_p ())) freq_e = freq_e.force_nonzero (); p = freq_e.apply_scale (new_est_niter + 1, 1).probability_in (freq_h); scale_loop_frequencies (loop, p); -- 2.14.4.44.g2045bb6 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-11-02 5:31 ` bin.cheng @ 2018-11-05 14:38 ` Jan Hubicka 2018-11-05 14:40 ` Jan Hubicka 1 sibling, 0 replies; 15+ messages in thread From: Jan Hubicka @ 2018-11-05 14:38 UTC (permalink / raw) To: bin.cheng; +Cc: GCC Patches, Richard Biener diff --git a/gcc/profile-count.h b/gcc/profile-count.h index f4d0c340a0a..4289bc5a004 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -200,11 +200,11 @@ public: ret.m_quality = profile_guessed; return ret; } - static profile_probability always () + static profile_probability always (enum profile_quality q = profile_precise) There are functions to convert value into given precision. If you wnat to guess that something is always taken, you write profile_probability::always().guessed () So for autofdo we only need to add .atofdo() conversion method. @@ -459,10 +459,12 @@ public: return RDIV (val * m_val, max_probability); } - /* Return 1-*THIS. */ + /* Return 1-*THIS. It's meaningless to invert an uninitialized value. */ profile_probability invert () const { - return profile_probability::always() - *this; + if (! initialized_p ()) + return *this; + return profile_probability::always (m_quality) - *this; How this changes the behaviour? If THIS is uninitialied profile_probability::alwyas() will return uninitialized which seem to make sense. If you have value of some quality it will merge the qualitis and will return value in corresponding m_quality. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-11-02 5:31 ` bin.cheng 2018-11-05 14:38 ` Jan Hubicka @ 2018-11-05 14:40 ` Jan Hubicka 2018-11-13 6:58 ` Bin.Cheng 1 sibling, 1 reply; 15+ messages in thread From: Jan Hubicka @ 2018-11-05 14:40 UTC (permalink / raw) To: bin.cheng; +Cc: GCC Patches, Richard Biener diff --git a/gcc/profile-count.h b/gcc/profile-count.h index 4289bc5a004..2b5e3269250 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -218,6 +218,11 @@ public: } + /* Return true if value is zero. */ + bool never_p () const + { + return m_val == 0; + } /* Return true if value has been initialized. */ bool initialized_p () const { @@ -288,9 +293,9 @@ public: } profile_probability operator+ (const profile_probability &other) const { - if (other == profile_probability::never ()) + if (other.never_p ()) return *this; - if (*this == profile_probability::never ()) + if (this->never_p ()) This is not correct change. If you add guessed 0 to precise 0, the result needs to be guessed 0 because we are no longer sure the code will not get executed. This is why all the checks here go explicitly to profile_probability::never. Honza ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-11-05 14:40 ` Jan Hubicka @ 2018-11-13 6:58 ` Bin.Cheng 0 siblings, 0 replies; 15+ messages in thread From: Bin.Cheng @ 2018-11-13 6:58 UTC (permalink / raw) To: Jan Hubicka; +Cc: bin.cheng, gcc-patches List, Richard Guenther On Mon, Nov 5, 2018 at 10:40 PM Jan Hubicka <hubicka@ucw.cz> wrote: > > diff --git a/gcc/profile-count.h b/gcc/profile-count.h > index 4289bc5a004..2b5e3269250 100644 > --- a/gcc/profile-count.h > +++ b/gcc/profile-count.h > @@ -218,6 +218,11 @@ public: > } > > > + /* Return true if value is zero. */ > + bool never_p () const > + { > + return m_val == 0; > + } > /* Return true if value has been initialized. */ > bool initialized_p () const > { > @@ -288,9 +293,9 @@ public: > } > profile_probability operator+ (const profile_probability &other) const > { > - if (other == profile_probability::never ()) > + if (other.never_p ()) > return *this; > - if (*this == profile_probability::never ()) > + if (this->never_p ()) > > This is not correct change. If you add guessed 0 to precise 0, > the result needs to be guessed 0 because we are no longer sure the code > will not get executed. This is why all the checks here go explicitly > to profile_probability::never. Hmm, so precise 0 means the code can never get executed? I also noticed that in predict.c there are lots of direct assignment of profile_count::zero as: propagation_unlikely_bbs_forward (void) { //... bb->count = profile_count::zero (); //... } This generally promote profile_count::zero from lower precision to precise precision, but function name/comment seems targeting unlikely executed code, rather than never executed. Is this inconsistent? Thanks, bin > > Honza ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <20181105141206.4ncu3s2v2jxv6o54@kam.mff.cuni.cz>]
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count [not found] ` <20181105141206.4ncu3s2v2jxv6o54@kam.mff.cuni.cz> @ 2018-11-20 10:54 ` bin.cheng [not found] ` <CAHFci28CQB3KK+Yp7gb8BR61UaGhAJJ-R1yzZPHxitczvgEB3w@mail.gmail.com> 0 siblings, 1 reply; 15+ messages in thread From: bin.cheng @ 2018-11-20 10:54 UTC (permalink / raw) To: Richard Biener, Jan Hubicka; +Cc: GCC Patches [-- Attachment #1: Type: text/plain, Size: 5101 bytes --] Sender:Jan Hubicka <hubicka@ucw.cz> Sent at:2018 Nov 5 (Mon) 22:21 To:Richard Biener <richard.guenther@gmail.com> Cc:bin.cheng <bin.cheng@linux.alibaba.com>; GCC Patches <gcc-patches@gcc.gnu.org> Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > > > Hi, > > > In new profile probability/count infra, we have different precision quality categories, > > > and probabilities/counts of different categories are not supposed to be compared or > > > calculated. Though in general is an improvement, it introduces unexpected behavior. > > > Specifically, class profile_probablity and profile_count themselves are implemented > > > by comparing probabilities/counts against profile_count::zero(). while zero() is of > > > profile_precision category, it's always compared different to zero of other precision > > > categories including afdo. > > > > > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > > > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > > > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > > > in this patch set. This patch doesn't handle "always" but it might be. > > > > > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > > > > > Bootstrap and test on x86_64 in patch set. Is it OK? > > > > I'll defer on the emit_store_flag_force change, likewise for the zero > > handling in > > compares - I don't think zeros of different qualities should compare equal. > > Would compares against ::always() not have the very same issue? > > Likewise ::even(), > > ::likely(), etc.? Those always get guessed quality. > > > > The invert change looks OK to me. The related change to the always() API would > > suggest to replace guessed_always() with always (guessed) and also do similar > > changes throughout the whole API... > > > > Honza? > > The zeros are really differenct zeros. profile_count::zero makes us to > drop the basic block into cold section because we know that it won't be > executed in normal run of program (either we have accurate profile > feedback or by proving that the program is on way to crash or user > annotated cold section). Having guessed zero or auto-fdo zero won't > make us to do such agressive size optimization. > This is important since those zeros relatively commonly happens by > accident and thus if we dropped all the code to cold section the cold > section would be visited relativel often during execution of program > which would eliminate its need. > > Most comparsion in profile-count.h which goes agains profile_count==zero > are realy intended to pass only for this "aboslute zero". They bypass > the precision adjusmtents which normally happen when you merge values > of different precision. > > What kind of unexpected behaviour are you seeing? > We already have nonzero_p which is what we use when we want to know that > count is non-zero in some sense of precision. Hi Honza, Sorry for letting this slip away. So in case of AutoFDO, due to the nature of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo precision, and we have checks against zero profile_count in precise precision All these checks end up with false and cause issues. Take the code in update_profiling_info as an example: update_profiling_info (struct cgraph_node *orig_node, struct cgraph_node *new_node) { struct cgraph_edge *cs; struct caller_statistics stats; profile_count new_sum, orig_sum; profile_count remainder, orig_node_count = orig_node->count; if (!(orig_node_count.ipa () > profile_count::zero ())) return; //... for (cs = new_node->callees; cs; cs = cs->next_callee) cs->count = cs->count.apply_scale (new_sum, orig_node_count); Since we also have below code in profile_count::operator>, if (other == profile_count::zero ()) return !(*this == profile_count::zero ()); If orig_node_count is afdo zero, the above zero check for orig_node_count returns false, we end up with passing zero density to apply_scale issue and asserting. In this updated patch, I restrcited changes only to profile_count::operator <, >, <= and >=. Plus, I think there is a latent typo in operator>= because current code return TRUE if '*this' is precise zero and 'other' is precise non-zero. @@ -879,7 +879,7 @@ public: if (other == profile_count::zero ()) return true; if (*this == profile_count::zero ()) - return !(other == profile_count::zero ()); + return !other.nonzero_p (); Bootstrap and test on x86_64 along with other patches. Thanks, bin 2018-11-19 Bin Cheng <bin.cheng@linux.alibaba.com> * profile-count.h (profile_count::operator<, >, <=): Check ZERO count using nonzero_p. (profile_count::oeprator>=): Invert return condition when *this is precise zero. Check ZERO count in that condition using nonzero_p. [-- Attachment #2: 0003-Check-ZERO-profile-count-regardless-of-precision.patch --] [-- Type: application/octet-stream, Size: 1758 bytes --] From 6c073ff0cd04abbbe8483d1d17e2491804cbf54f Mon Sep 17 00:00:00 2001 From: chengbin <bin.cheng@linux.alibaba.com> Date: Tue, 13 Nov 2018 20:26:15 +0800 Subject: [PATCH 3/4] Check ZERO profile count regardless of precision. --- gcc/profile-count.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/profile-count.h b/gcc/profile-count.h index f4d0c340a0a..6a79fced633 100644 --- a/gcc/profile-count.h +++ b/gcc/profile-count.h @@ -831,7 +831,7 @@ public: if (!initialized_p () || !other.initialized_p ()) return false; if (*this == profile_count::zero ()) - return !(other == profile_count::zero ()); + return other.nonzero_p (); if (other == profile_count::zero ()) return false; gcc_checking_assert (compatible_p (other)); @@ -844,7 +844,7 @@ public: if (*this == profile_count::zero ()) return false; if (other == profile_count::zero ()) - return !(*this == profile_count::zero ()); + return this->nonzero_p (); gcc_checking_assert (compatible_p (other)); return initialized_p () && other.initialized_p () && m_val > other.m_val; } @@ -868,7 +868,7 @@ public: if (*this == profile_count::zero ()) return true; if (other == profile_count::zero ()) - return (*this == profile_count::zero ()); + return !this->nonzero_p (); gcc_checking_assert (compatible_p (other)); return m_val <= other.m_val; } @@ -879,7 +879,7 @@ public: if (other == profile_count::zero ()) return true; if (*this == profile_count::zero ()) - return !(other == profile_count::zero ()); + return !other.nonzero_p (); gcc_checking_assert (compatible_p (other)); return m_val >= other.m_val; } -- 2.14.4.44.g2045bb6 ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <CAHFci28CQB3KK+Yp7gb8BR61UaGhAJJ-R1yzZPHxitczvgEB3w@mail.gmail.com>]
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count [not found] ` <CAHFci28CQB3KK+Yp7gb8BR61UaGhAJJ-R1yzZPHxitczvgEB3w@mail.gmail.com> @ 2018-11-28 16:20 ` Jan Hubicka 2018-12-04 8:40 ` Bin.Cheng 0 siblings, 1 reply; 15+ messages in thread From: Jan Hubicka @ 2018-11-28 16:20 UTC (permalink / raw) To: Bin.Cheng; +Cc: bin.cheng, Richard Guenther, gcc-patches List > On Tue, Nov 20, 2018 at 6:55 PM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > Sender:Jan Hubicka <hubicka@ucw.cz> > > Sent at:2018 Nov 5 (Mon) 22:21 > > To:Richard Biener <richard.guenther@gmail.com> > > Cc:bin.cheng <bin.cheng@linux.alibaba.com>; GCC Patches <gcc-patches@gcc.gnu.org> > > Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count > > > > > > > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > > > > > > > Hi, > > > > > In new profile probability/count infra, we have different precision quality categories, > > > > > and probabilities/counts of different categories are not supposed to be compared or > > > > > calculated. Though in general is an improvement, it introduces unexpected behavior. > > > > > Specifically, class profile_probablity and profile_count themselves are implemented > > > > > by comparing probabilities/counts against profile_count::zero(). while zero() is of > > > > > profile_precision category, it's always compared different to zero of other precision > > > > > categories including afdo. > > > > > > > > > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > > > > > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > > > > > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > > > > > in this patch set. This patch doesn't handle "always" but it might be. > > > > > > > > > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > > > > > > > > > Bootstrap and test on x86_64 in patch set. Is it OK? > > > > > > > > I'll defer on the emit_store_flag_force change, likewise for the zero > > > > handling in > > > > compares - I don't think zeros of different qualities should compare equal. > > > > Would compares against ::always() not have the very same issue? > > > > Likewise ::even(), > > > > ::likely(), etc.? Those always get guessed quality. > > > > > > > > The invert change looks OK to me. The related change to the always() API would > > > > suggest to replace guessed_always() with always (guessed) and also do similar > > > > changes throughout the whole API... > > > > > > > > Honza? > > > > > > The zeros are really differenct zeros. profile_count::zero makes us to > > > drop the basic block into cold section because we know that it won't be > > > executed in normal run of program (either we have accurate profile > > > feedback or by proving that the program is on way to crash or user > > > annotated cold section). Having guessed zero or auto-fdo zero won't > > > make us to do such agressive size optimization. > > > This is important since those zeros relatively commonly happens by > > > accident and thus if we dropped all the code to cold section the cold > > > section would be visited relativel often during execution of program > > > which would eliminate its need. > > > > > > Most comparsion in profile-count.h which goes agains profile_count==zero > > > are realy intended to pass only for this "aboslute zero". They bypass > > > the precision adjusmtents which normally happen when you merge values > > > of different precision. > > > > > > What kind of unexpected behaviour are you seeing? > > > We already have nonzero_p which is what we use when we want to know that > > > count is non-zero in some sense of precision. > > Hi Honza, > > Sorry for letting this slip away. So in case of AutoFDO, due to the nature > > of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo > > precision, and we have checks against zero profile_count in precise precision > > All these checks end up with false and cause issues. Take the code in > > update_profiling_info as an example: > > > > update_profiling_info (struct cgraph_node *orig_node, > > struct cgraph_node *new_node) > > { > > struct cgraph_edge *cs; > > struct caller_statistics stats; > > profile_count new_sum, orig_sum; > > profile_count remainder, orig_node_count = orig_node->count; > > > > if (!(orig_node_count.ipa () > profile_count::zero ())) > > return; > > //... > > for (cs = new_node->callees; cs; cs = cs->next_callee) > > cs->count = cs->count.apply_scale (new_sum, orig_node_count); > > > > Since we also have below code in profile_count::operator>, > > if (other == profile_count::zero ()) > > return !(*this == profile_count::zero ()); > > > > If orig_node_count is afdo zero, the above zero check for orig_node_count > > returns false, we end up with passing zero density to apply_scale issue and > > asserting. > > > > In this updated patch, I restrcited changes only to profile_count::operator > > <, >, <= and >=. Plus, I think there is a latent typo in operator>= because > > current code return TRUE if '*this' is precise zero and 'other' is precise > > non-zero. > > @@ -879,7 +879,7 @@ public: > > if (other == profile_count::zero ()) > > return true; > > if (*this == profile_count::zero ()) > > - return !(other == profile_count::zero ()); > > + return !other.nonzero_p (); We already have True: profile_count::zero < any other value any other value > profile_count::zero profile_count::zero <= any initialized value profile_count::zero <= profile_count::zero any initialized value >= profile_count::zero false profile_count::zero > any other value any other value < profile_count::zero You are right about typo in >=, it should be: Index: profile-count.h =================================================================== --- profile-count.h (revision 266450) +++ profile-count.h (working copy) @@ -879,7 +879,7 @@ if (other == profile_count::zero ()) return true; if (*this == profile_count::zero ()) - return !(other == profile_count::zero ()); + return other == profile_count::zero (); gcc_checking_assert (compatible_p (other)); return m_val >= other.m_val; } With your patch we get false for: profile_count::zero < guessed/auto_fdo/other 0 guessed/auto_fdo/other > profile_count::zero guessed/auto_fdo/other <= profile_count::zero profile_count::zero >= profile_count::zero The original idea was to intentionally make profile_count::zero smaller than any toher types of initialized values, since it is more strict hint that the path will not be taken. For example in bb_reorder if you end up with "funny" profile with two exit edges one having profile_count::zero and other being zero as result of (unsucesfull) profile updates it is still better idea to pick the profile_count::zero for taken edge. With your patch it will end up picking either of the paths. How the patch helps to your situation? The fix for >= is OK, thanks for spotting that! Honza > > > > Bootstrap and test on x86_64 along with other patches. > Ping. > > Thanks, > bin > > > > Thanks, > > bin > > > > 2018-11-19 Bin Cheng <bin.cheng@linux.alibaba.com> > > > > * profile-count.h (profile_count::operator<, >, <=): Check ZERO count > > using nonzero_p. > > (profile_count::oeprator>=): Invert return condition when *this is > > precise zero. Check ZERO count in that condition using nonzero_p. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-11-28 16:20 ` Jan Hubicka @ 2018-12-04 8:40 ` Bin.Cheng 2018-12-07 10:00 ` Bin.Cheng 0 siblings, 1 reply; 15+ messages in thread From: Bin.Cheng @ 2018-12-04 8:40 UTC (permalink / raw) To: Jan Hubicka; +Cc: bin.cheng, Richard Guenther, gcc-patches List On Thu, Nov 29, 2018 at 12:20 AM Jan Hubicka <hubicka@ucw.cz> wrote: > > > On Tue, Nov 20, 2018 at 6:55 PM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > > > Sender:Jan Hubicka <hubicka@ucw.cz> > > > Sent at:2018 Nov 5 (Mon) 22:21 > > > To:Richard Biener <richard.guenther@gmail.com> > > > Cc:bin.cheng <bin.cheng@linux.alibaba.com>; GCC Patches <gcc-patches@gcc.gnu.org> > > > Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count > > > > > > > > > > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > > > > > > > > > Hi, > > > > > > In new profile probability/count infra, we have different precision quality categories, > > > > > > and probabilities/counts of different categories are not supposed to be compared or > > > > > > calculated. Though in general is an improvement, it introduces unexpected behavior. > > > > > > Specifically, class profile_probablity and profile_count themselves are implemented > > > > > > by comparing probabilities/counts against profile_count::zero(). while zero() is of > > > > > > profile_precision category, it's always compared different to zero of other precision > > > > > > categories including afdo. > > > > > > > > > > > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > > > > > > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > > > > > > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > > > > > > in this patch set. This patch doesn't handle "always" but it might be. > > > > > > > > > > > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > > > > > > > > > > > Bootstrap and test on x86_64 in patch set. Is it OK? > > > > > > > > > > I'll defer on the emit_store_flag_force change, likewise for the zero > > > > > handling in > > > > > compares - I don't think zeros of different qualities should compare equal. > > > > > Would compares against ::always() not have the very same issue? > > > > > Likewise ::even(), > > > > > ::likely(), etc.? Those always get guessed quality. > > > > > > > > > > The invert change looks OK to me. The related change to the always() API would > > > > > suggest to replace guessed_always() with always (guessed) and also do similar > > > > > changes throughout the whole API... > > > > > > > > > > Honza? > > > > > > > > The zeros are really differenct zeros. profile_count::zero makes us to > > > > drop the basic block into cold section because we know that it won't be > > > > executed in normal run of program (either we have accurate profile > > > > feedback or by proving that the program is on way to crash or user > > > > annotated cold section). Having guessed zero or auto-fdo zero won't > > > > make us to do such agressive size optimization. > > > > This is important since those zeros relatively commonly happens by > > > > accident and thus if we dropped all the code to cold section the cold > > > > section would be visited relativel often during execution of program > > > > which would eliminate its need. > > > > > > > > Most comparsion in profile-count.h which goes agains profile_count==zero > > > > are realy intended to pass only for this "aboslute zero". They bypass > > > > the precision adjusmtents which normally happen when you merge values > > > > of different precision. > > > > > > > > What kind of unexpected behaviour are you seeing? > > > > We already have nonzero_p which is what we use when we want to know that > > > > count is non-zero in some sense of precision. > > > Hi Honza, > > > Sorry for letting this slip away. So in case of AutoFDO, due to the nature > > > of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo > > > precision, and we have checks against zero profile_count in precise precision > > > All these checks end up with false and cause issues. Take the code in > > > update_profiling_info as an example: > > > > > > update_profiling_info (struct cgraph_node *orig_node, > > > struct cgraph_node *new_node) > > > { > > > struct cgraph_edge *cs; > > > struct caller_statistics stats; > > > profile_count new_sum, orig_sum; > > > profile_count remainder, orig_node_count = orig_node->count; > > > > > > if (!(orig_node_count.ipa () > profile_count::zero ())) > > > return; > > > //... > > > for (cs = new_node->callees; cs; cs = cs->next_callee) > > > cs->count = cs->count.apply_scale (new_sum, orig_node_count); > > > > > > Since we also have below code in profile_count::operator>, > > > if (other == profile_count::zero ()) > > > return !(*this == profile_count::zero ()); > > > > > > If orig_node_count is afdo zero, the above zero check for orig_node_count > > > returns false, we end up with passing zero density to apply_scale issue and > > > asserting. > > > > > > In this updated patch, I restrcited changes only to profile_count::operator > > > <, >, <= and >=. Plus, I think there is a latent typo in operator>= because > > > current code return TRUE if '*this' is precise zero and 'other' is precise > > > non-zero. > > > @@ -879,7 +879,7 @@ public: > > > if (other == profile_count::zero ()) > > > return true; > > > if (*this == profile_count::zero ()) > > > - return !(other == profile_count::zero ()); > > > + return !other.nonzero_p (); > > We already have > > True: > profile_count::zero < any other value > any other value > profile_count::zero > profile_count::zero <= any initialized value > profile_count::zero <= profile_count::zero > any initialized value >= profile_count::zero > > false > profile_count::zero > any other value > any other value < profile_count::zero > > You are right about typo in >=, it should be: > > Index: profile-count.h > =================================================================== > --- profile-count.h (revision 266450) > +++ profile-count.h (working copy) > @@ -879,7 +879,7 @@ > if (other == profile_count::zero ()) > return true; > if (*this == profile_count::zero ()) > - return !(other == profile_count::zero ()); > + return other == profile_count::zero (); > gcc_checking_assert (compatible_p (other)); > return m_val >= other.m_val; > } > > With your patch we get false for: > profile_count::zero < guessed/auto_fdo/other 0 > guessed/auto_fdo/other > profile_count::zero > guessed/auto_fdo/other <= profile_count::zero > profile_count::zero >= profile_count::zero > > The original idea was to intentionally make profile_count::zero smaller > than any toher types of initialized values, since it is more strict hint > that the path will not be taken. > For example in bb_reorder if you end up with "funny" profile with two > exit edges one having profile_count::zero and other being zero as result > of (unsucesfull) profile updates it is still better idea to pick the > profile_count::zero for taken edge. With your patch it will end up > picking either of the paths. > > How the patch helps to your situation? Hi Honza, thanks very much for elaborating. Issue in case of autofdo is as described in last message: Given update_profiling_info implemented as below: update_profiling_info (struct cgraph_node *orig_node, struct cgraph_node *new_node) { struct cgraph_edge *cs; struct caller_statistics stats; profile_count new_sum, orig_sum; profile_count remainder, orig_node_count = orig_node->count; //*****Operator ">" returns true if orig_node_count == autofdo.zero. if (!(orig_node_count.ipa () > profile_count::zero ())) return; //... for (cs = new_node->callees; cs; cs = cs->next_callee) //*****Result in apply_scale called with autofdo.zero as the 2nd argument. cs->count = cs->count.apply_scale (new_sum, orig_node_count); Also apply_scale is implemented as: profile_count apply_scale (profile_count num, profile_count den) const { if (*this == profile_count::zero ()) return *this; if (num == profile_count::zero ()) return num; if (!initialized_p () || !num.initialized_p () || !den.initialized_p ()) return profile_count::uninitialized (); if (num == den) return *this; gcc_checking_assert (den.m_val); Here we have (num != zero && den == autofdo.zero), it triggers the gcc_checking_assert. According to your explanation, guess we need to call force_nonzero for orig_node_count before calling apply_scale, right? Thanks, bin ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-12-04 8:40 ` Bin.Cheng @ 2018-12-07 10:00 ` Bin.Cheng 2018-12-07 16:57 ` Jan Hubicka 0 siblings, 1 reply; 15+ messages in thread From: Bin.Cheng @ 2018-12-07 10:00 UTC (permalink / raw) To: Jan Hubicka; +Cc: bin.cheng, Richard Guenther, gcc-patches List On Tue, Dec 4, 2018 at 4:40 PM Bin.Cheng <amker.cheng@gmail.com> wrote: > > On Thu, Nov 29, 2018 at 12:20 AM Jan Hubicka <hubicka@ucw.cz> wrote: > > > > > On Tue, Nov 20, 2018 at 6:55 PM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > > > > > Sender:Jan Hubicka <hubicka@ucw.cz> > > > > Sent at:2018 Nov 5 (Mon) 22:21 > > > > To:Richard Biener <richard.guenther@gmail.com> > > > > Cc:bin.cheng <bin.cheng@linux.alibaba.com>; GCC Patches <gcc-patches@gcc.gnu.org> > > > > Subject:Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count > > > > > > > > > > > > > > > On Wed, Oct 31, 2018 at 7:30 AM bin.cheng <bin.cheng@linux.alibaba.com> wrote: > > > > > > > > > > > > > > Hi, > > > > > > > In new profile probability/count infra, we have different precision quality categories, > > > > > > > and probabilities/counts of different categories are not supposed to be compared or > > > > > > > calculated. Though in general is an improvement, it introduces unexpected behavior. > > > > > > > Specifically, class profile_probablity and profile_count themselves are implemented > > > > > > > by comparing probabilities/counts against profile_count::zero(). while zero() is of > > > > > > > profile_precision category, it's always compared different to zero of other precision > > > > > > > categories including afdo. > > > > > > > > > > > > > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > > > > > > > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > > > > > > > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > > > > > > > in this patch set. This patch doesn't handle "always" but it might be. > > > > > > > > > > > > > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > > > > > > > > > > > > > Bootstrap and test on x86_64 in patch set. Is it OK? > > > > > > > > > > > > I'll defer on the emit_store_flag_force change, likewise for the zero > > > > > > handling in > > > > > > compares - I don't think zeros of different qualities should compare equal. > > > > > > Would compares against ::always() not have the very same issue? > > > > > > Likewise ::even(), > > > > > > ::likely(), etc.? Those always get guessed quality. > > > > > > > > > > > > The invert change looks OK to me. The related change to the always() API would > > > > > > suggest to replace guessed_always() with always (guessed) and also do similar > > > > > > changes throughout the whole API... > > > > > > > > > > > > Honza? > > > > > > > > > > The zeros are really differenct zeros. profile_count::zero makes us to > > > > > drop the basic block into cold section because we know that it won't be > > > > > executed in normal run of program (either we have accurate profile > > > > > feedback or by proving that the program is on way to crash or user > > > > > annotated cold section). Having guessed zero or auto-fdo zero won't > > > > > make us to do such agressive size optimization. > > > > > This is important since those zeros relatively commonly happens by > > > > > accident and thus if we dropped all the code to cold section the cold > > > > > section would be visited relativel often during execution of program > > > > > which would eliminate its need. > > > > > > > > > > Most comparsion in profile-count.h which goes agains profile_count==zero > > > > > are realy intended to pass only for this "aboslute zero". They bypass > > > > > the precision adjusmtents which normally happen when you merge values > > > > > of different precision. > > > > > > > > > > What kind of unexpected behaviour are you seeing? > > > > > We already have nonzero_p which is what we use when we want to know that > > > > > count is non-zero in some sense of precision. > > > > Hi Honza, > > > > Sorry for letting this slip away. So in case of AutoFDO, due to the nature > > > > of sampling, lots of funcs/bbs are annotated with zero profile_count in afdo > > > > precision, and we have checks against zero profile_count in precise precision > > > > All these checks end up with false and cause issues. Take the code in > > > > update_profiling_info as an example: > > > > > > > > update_profiling_info (struct cgraph_node *orig_node, > > > > struct cgraph_node *new_node) > > > > { > > > > struct cgraph_edge *cs; > > > > struct caller_statistics stats; > > > > profile_count new_sum, orig_sum; > > > > profile_count remainder, orig_node_count = orig_node->count; > > > > > > > > if (!(orig_node_count.ipa () > profile_count::zero ())) > > > > return; > > > > //... > > > > for (cs = new_node->callees; cs; cs = cs->next_callee) > > > > cs->count = cs->count.apply_scale (new_sum, orig_node_count); > > > > > > > > Since we also have below code in profile_count::operator>, > > > > if (other == profile_count::zero ()) > > > > return !(*this == profile_count::zero ()); > > > > > > > > If orig_node_count is afdo zero, the above zero check for orig_node_count > > > > returns false, we end up with passing zero density to apply_scale issue and > > > > asserting. > > > > > > > > In this updated patch, I restrcited changes only to profile_count::operator > > > > <, >, <= and >=. Plus, I think there is a latent typo in operator>= because > > > > current code return TRUE if '*this' is precise zero and 'other' is precise > > > > non-zero. > > > > @@ -879,7 +879,7 @@ public: > > > > if (other == profile_count::zero ()) > > > > return true; > > > > if (*this == profile_count::zero ()) > > > > - return !(other == profile_count::zero ()); > > > > + return !other.nonzero_p (); > > > > We already have > > > > True: > > profile_count::zero < any other value > > any other value > profile_count::zero > > profile_count::zero <= any initialized value > > profile_count::zero <= profile_count::zero > > any initialized value >= profile_count::zero > > > > false > > profile_count::zero > any other value > > any other value < profile_count::zero > > > > You are right about typo in >=, it should be: > > > > Index: profile-count.h > > =================================================================== > > --- profile-count.h (revision 266450) > > +++ profile-count.h (working copy) > > @@ -879,7 +879,7 @@ > > if (other == profile_count::zero ()) > > return true; > > if (*this == profile_count::zero ()) > > - return !(other == profile_count::zero ()); > > + return other == profile_count::zero (); > > gcc_checking_assert (compatible_p (other)); > > return m_val >= other.m_val; > > } > > > > With your patch we get false for: > > profile_count::zero < guessed/auto_fdo/other 0 > > guessed/auto_fdo/other > profile_count::zero > > guessed/auto_fdo/other <= profile_count::zero > > profile_count::zero >= profile_count::zero > > > > The original idea was to intentionally make profile_count::zero smaller > > than any toher types of initialized values, since it is more strict hint > > that the path will not be taken. > > For example in bb_reorder if you end up with "funny" profile with two > > exit edges one having profile_count::zero and other being zero as result > > of (unsucesfull) profile updates it is still better idea to pick the > > profile_count::zero for taken edge. With your patch it will end up > > picking either of the paths. > > > > How the patch helps to your situation? > Hi Honza, thanks very much for elaborating. Issue in case of autofdo > is as described in last message: > Given update_profiling_info implemented as below: > > update_profiling_info (struct cgraph_node *orig_node, > struct cgraph_node *new_node) > { > struct cgraph_edge *cs; > struct caller_statistics stats; > profile_count new_sum, orig_sum; > profile_count remainder, orig_node_count = orig_node->count; > > //*****Operator ">" returns true if orig_node_count == autofdo.zero. > if (!(orig_node_count.ipa () > profile_count::zero ())) > return; > //... > for (cs = new_node->callees; cs; cs = cs->next_callee) > //*****Result in apply_scale called with autofdo.zero as the 2nd argument. > cs->count = cs->count.apply_scale (new_sum, orig_node_count); > > Also apply_scale is implemented as: > profile_count apply_scale (profile_count num, profile_count den) const > { > if (*this == profile_count::zero ()) > return *this; > if (num == profile_count::zero ()) > return num; > if (!initialized_p () || !num.initialized_p () || !den.initialized_p ()) > return profile_count::uninitialized (); > if (num == den) > return *this; > gcc_checking_assert (den.m_val); > > Here we have (num != zero && den == autofdo.zero), it triggers the > gcc_checking_assert. > According to your explanation, guess we need to call force_nonzero for > orig_node_count before calling apply_scale, right? Hi Honza, I have committed the typo fix as revision 266885. Also I followed your suggestion (IIUC) by calling profile_count::adjust_for_ipa_scaling for zero den in function update_profiling_info. It works and does make more sense than changing the global zero check logic. Patch tested as before, is it ok? Thanks, bin diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 4471bae11c7..5074ef63da1 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -3715,9 +3715,11 @@ update_profiling_info (struct cgraph_node *orig_node, new_sum = orig_node_count.combine_with_ipa_count (new_sum); orig_node->count = remainder; + profile_count::adjust_for_ipa_scaling (&new_sum, &orig_node_count); for (cs = new_node->callees; cs; cs = cs->next_callee) cs->count = cs->count.apply_scale (new_sum, orig_node_count); + profile_count::adjust_for_ipa_scaling (&remainder, &orig_node_count); for (cs = orig_node->callees; cs; cs = cs->next_callee) cs->count = cs->count.apply_scale (remainder, orig_node_count); 2018-12-07 Bin Cheng <bin.cheng@linux.alibaba.com> * ipa-cp.c (update_profiling_info): Call adjust_for_ipa_scaling for zero profile count. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-12-07 10:00 ` Bin.Cheng @ 2018-12-07 16:57 ` Jan Hubicka 2018-12-09 6:40 ` Bin.Cheng 0 siblings, 1 reply; 15+ messages in thread From: Jan Hubicka @ 2018-12-07 16:57 UTC (permalink / raw) To: Bin.Cheng; +Cc: bin.cheng, Richard Guenther, gcc-patches List > Hi Honza, > I have committed the typo fix as revision 266885. > Also I followed your suggestion (IIUC) by calling > profile_count::adjust_for_ipa_scaling for zero den in function > update_profiling_info. It works and does make more sense than > changing the global zero check logic. > Patch tested as before, is it ok? Thanks, patch is OK. What is situation with AutoFDO now? It would be very nice to get it fixed for the release :) Honza > > Thanks, > bin > > diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c > index 4471bae11c7..5074ef63da1 100644 > --- a/gcc/ipa-cp.c > +++ b/gcc/ipa-cp.c > @@ -3715,9 +3715,11 @@ update_profiling_info (struct cgraph_node *orig_node, > new_sum = orig_node_count.combine_with_ipa_count (new_sum); > orig_node->count = remainder; > > + profile_count::adjust_for_ipa_scaling (&new_sum, &orig_node_count); > for (cs = new_node->callees; cs; cs = cs->next_callee) > cs->count = cs->count.apply_scale (new_sum, orig_node_count); > > + profile_count::adjust_for_ipa_scaling (&remainder, &orig_node_count); > for (cs = orig_node->callees; cs; cs = cs->next_callee) > cs->count = cs->count.apply_scale (remainder, orig_node_count); > > 2018-12-07 Bin Cheng <bin.cheng@linux.alibaba.com> > > * ipa-cp.c (update_profiling_info): Call adjust_for_ipa_scaling for > zero profile count. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-12-07 16:57 ` Jan Hubicka @ 2018-12-09 6:40 ` Bin.Cheng 0 siblings, 0 replies; 15+ messages in thread From: Bin.Cheng @ 2018-12-09 6:40 UTC (permalink / raw) To: Jan Hubicka; +Cc: bin.cheng, Richard Guenther, gcc-patches List On Sat, Dec 8, 2018 at 12:57 AM Jan Hubicka <hubicka@ucw.cz> wrote: > > > Hi Honza, > > I have committed the typo fix as revision 266885. > > Also I followed your suggestion (IIUC) by calling > > profile_count::adjust_for_ipa_scaling for zero den in function > > update_profiling_info. It works and does make more sense than > > changing the global zero check logic. > > Patch tested as before, is it ok? > > Thanks, patch is OK. > What is situation with AutoFDO now? It would be very nice to get it > fixed for the release :) Jeff already approved the major patch, I just need to do minor updates. I will post two more small patches soon. Thanks, bin > > Honza > > > > Thanks, > > bin > > > > diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c > > index 4471bae11c7..5074ef63da1 100644 > > --- a/gcc/ipa-cp.c > > +++ b/gcc/ipa-cp.c > > @@ -3715,9 +3715,11 @@ update_profiling_info (struct cgraph_node *orig_node, > > new_sum = orig_node_count.combine_with_ipa_count (new_sum); > > orig_node->count = remainder; > > > > + profile_count::adjust_for_ipa_scaling (&new_sum, &orig_node_count); > > for (cs = new_node->callees; cs; cs = cs->next_callee) > > cs->count = cs->count.apply_scale (new_sum, orig_node_count); > > > > + profile_count::adjust_for_ipa_scaling (&remainder, &orig_node_count); > > for (cs = orig_node->callees; cs; cs = cs->next_callee) > > cs->count = cs->count.apply_scale (remainder, orig_node_count); > > > > 2018-12-07 Bin Cheng <bin.cheng@linux.alibaba.com> > > > > * ipa-cp.c (update_profiling_info): Call adjust_for_ipa_scaling for > > zero profile count. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-10-31 8:33 [PATCH AutoFDO/2]Treat ZERO as common profile probability/count bin.cheng 2018-10-31 9:43 ` Richard Biener @ 2018-10-31 15:02 ` Jeff Law 2018-11-01 1:11 ` Bin.Cheng 1 sibling, 1 reply; 15+ messages in thread From: Jeff Law @ 2018-10-31 15:02 UTC (permalink / raw) To: bin.cheng, gcc-patches On 10/31/18 12:30 AM, bin.cheng wrote: > Hi, > In new profile probability/count infra, we have different precision quality categories, > and probabilities/counts of different categories are not supposed to be compared or > calculated. Though in general is an improvement, it introduces unexpected behavior. > Specifically, class profile_probablity and profile_count themselves are implemented > by comparing probabilities/counts against profile_count::zero(). while zero() is of > profile_precision category, it's always compared different to zero of other precision > categories including afdo. > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > in this patch set. This patch doesn't handle "always" but it might be. > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > Bootstrap and test on x86_64 in patch set. Is it OK? > > Thanks, > bin > > 2018-10-31 Bin Cheng <bin.cheng@linux.alibaba.com> > > * expmed.c (emit_store_flag_force): Use profile_probability::always. > * profile-count.h (profile_probability::always): Add parameter. > (profile_probability::operator==, profile_count::operator==): Treat > ZERO as common probability/count regardless of its quality. > (profile_probability::invert): Don't invert uninitialized probability. > I'm really not sure the emit_store_flag_force change is right -- essentially without external information I can't see how we can pass in any probability here other than "we don't know" which is profile_probability::uninitialized IIUC. You could potentially make an argument that a 50-50 probability is reasonable here. That's profile_probability::even. But I just don't see how profile_probability::always is right here. jeff ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH AutoFDO/2]Treat ZERO as common profile probability/count 2018-10-31 15:02 ` Jeff Law @ 2018-11-01 1:11 ` Bin.Cheng 0 siblings, 0 replies; 15+ messages in thread From: Bin.Cheng @ 2018-11-01 1:11 UTC (permalink / raw) To: Jeff Law; +Cc: bin.cheng, gcc-patches List On Wed, Oct 31, 2018 at 10:36 PM Jeff Law <law@redhat.com> wrote: > > On 10/31/18 12:30 AM, bin.cheng wrote: > > Hi, > > In new profile probability/count infra, we have different precision quality categories, > > and probabilities/counts of different categories are not supposed to be compared or > > calculated. Though in general is an improvement, it introduces unexpected behavior. > > Specifically, class profile_probablity and profile_count themselves are implemented > > by comparing probabilities/counts against profile_count::zero(). while zero() is of > > profile_precision category, it's always compared different to zero of other precision > > categories including afdo. > > > > I can see two ways fixing this: 1) Treat zero as a common probability/count regardless > > of its category; 2) Provide an "is_zero" method rather than relying on "==" comparison > > against probability_count::zero(). 2) requires lots of code changes so I went with 1) > > in this patch set. This patch doesn't handle "always" but it might be. > > > > This patch also corrects a minor issue where we try to invert an uninitialized value. > > > > Bootstrap and test on x86_64 in patch set. Is it OK? > > > > Thanks, > > bin > > > > 2018-10-31 Bin Cheng <bin.cheng@linux.alibaba.com> > > > > * expmed.c (emit_store_flag_force): Use profile_probability::always. > > * profile-count.h (profile_probability::always): Add parameter. > > (profile_probability::operator==, profile_count::operator==): Treat > > ZERO as common probability/count regardless of its quality. > > (profile_probability::invert): Don't invert uninitialized probability. > > > > I'm really not sure the emit_store_flag_force change is right -- > essentially without external information I can't see how we can pass in > any probability here other than "we don't know" which is > profile_probability::uninitialized IIUC. > > You could potentially make an argument that a 50-50 probability is > reasonable here. That's profile_probability::even. But I just don't > see how profile_probability::always is right here. Thanks, I realized I mis-spotted that piece of code. Will discard the part. Thanks, bin > > jeff ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2018-12-09 6:40 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-10-31 8:33 [PATCH AutoFDO/2]Treat ZERO as common profile probability/count bin.cheng 2018-10-31 9:43 ` Richard Biener 2018-10-31 9:57 ` Bin.Cheng 2018-11-02 5:31 ` bin.cheng 2018-11-05 14:38 ` Jan Hubicka 2018-11-05 14:40 ` Jan Hubicka 2018-11-13 6:58 ` Bin.Cheng [not found] ` <20181105141206.4ncu3s2v2jxv6o54@kam.mff.cuni.cz> 2018-11-20 10:54 ` bin.cheng [not found] ` <CAHFci28CQB3KK+Yp7gb8BR61UaGhAJJ-R1yzZPHxitczvgEB3w@mail.gmail.com> 2018-11-28 16:20 ` Jan Hubicka 2018-12-04 8:40 ` Bin.Cheng 2018-12-07 10:00 ` Bin.Cheng 2018-12-07 16:57 ` Jan Hubicka 2018-12-09 6:40 ` Bin.Cheng 2018-10-31 15:02 ` Jeff Law 2018-11-01 1:11 ` Bin.Cheng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).