public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Tweak predictors based on SPEC2006 and SPEC2017
@ 2018-01-09 11:23 Martin Liška
  2018-01-09 11:26 ` [PATCH 1/5] Fix usage of analyze_brprob.py script Martin Liška
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Martin Liška @ 2018-01-09 11:23 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 6199 bytes --]

Hello.

In current stage1 Honza spent a significant amount of time to improve profiling infrastructure.
New classes profile_count and profile_probability have been added and we hope the profile
is maintained more sensitively among various optimization passes.

My patch series makes adjustments to values of predictors defined in predict.def based on
what I measured on SPEC2006 and SPEC2017. For both, -O2 -march=native was used and it was
run on a Haswell CPU. Note that numbers should be reproducible on any CPU, as coverage mapping
to source files should match.

Compare to previous release (GCC 7.x branch), I made 2 main differences:
1) For GCC 7.x tuning I wrongly used train run to collect numbers -> fixed by running train run
with ref size of benchmarks
2) I noticed we have branches that in some cases dominate a predictor. That's caused by fact that
we calculate average predictor value based on number of successful branching. Imagine a very hot
condition in a loop. Such branch can dominate a predictor. So that I decided to calculate statistics
also without branches that have coverage >= 10%. Doing that shows that some predictors can be close
to 50%, or value chosen in predict.def is very far from what we have considering all branches.
In such cases, I explain why I've chosen to either remove a predictor or adjust based on non-dominating
branches.

The email contains of statistics for SPEC2006 suite, SPEC2017 suite and combined values based on these.
Apart from that, statistics for each individual benchmark can be also found.

There are numbers for 2 configurations of SPEC2006 and SPEC2017 which were run with the series on a Ryzen 5
machine:

+-------------------------------------------+--------+----------+----------+
| 1) ===== CPU2006: -O2 -march=native       |        |          |          |
| Performance Regressions - Execution Time  | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2006/FP/459.GemsFDTD             | 3.78%  | 234.6328 | 243.5055 |
| SPEC/SPEC2006/INT/456.hmmer               | 3.08%  | 312.4193 | 322.0455 |
| SPEC/SPEC2006/FP/482.sphinx3              | 2.26%  | 370.0952 | 378.4484 |
| SPEC/SPEC2006/FP/433.milc                 | 1.95%  | 289.1947 | 294.8295 |
| SPEC/SPEC2006/FP/465.tonto                | 1.70%  | 325.3729 | 330.9081 |
| SPEC/SPEC2006/FP/416.gamess               | 1.63%  | 550.4813 | 559.4563 |
| SPEC/SPEC2006/INT/429.mcf                 | 1.61%  | 245.2728 | 249.2234 |
|                                           |        |          |          |
| Performance Improvements - Execution Time | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2006/FP/436.cactusADM            | -5.94% | 305.2817 | 287.1523 |
| SPEC/SPEC2006/INT/483.xalancbmk           | -2.84% | 200.1121 | 194.4234 |
| SPEC/SPEC2006/INT/401.bzip2               | -1.31% | 378.0334 | 373.0694 |
|                                           |        |          |          |
| 2) ===== CPU2006: -Ofast -march=native    |        |          |          |
| Performance Regressions - Execution Time  | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2006/FP/459.GemsFDTD             | 3.11%  | 210.9732 | 217.5323 |
| SPEC/SPEC2006/INT/429.mcf                 | 1.63%  | 242.1709 | 246.1208 |
| SPEC/SPEC2006/FP/482.sphinx3              | 1.57%  | 270.7089 | 274.9587 |
| SPEC/SPEC2006/INT/400.perlbench           | 1.40%  | 274.1239 | 277.9529 |
| SPEC/SPEC2006/FP/481.wrf                  | 1.33%  | 180.1668 | 182.555  |
|                                           |        |          |          |
| Performance Improvements - Execution Time | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2006/FP/450.soplex               | -4.38% | 214.6197 | 205.2086 |
| SPEC/SPEC2006/FP/436.cactusADM            | -3.24% | 153.1062 | 148.1397 |
| SPEC/SPEC2006/FP/435.gromacs              | -2.94% | 199.6355 | 193.7684 |
| SPEC/SPEC2006/INT/471.omnetpp             | -1.93% | 285.8131 | 280.3005 |
| SPEC/SPEC2006/INT/445.gobmk               | -1.14% | 375.2672 | 370.9727 |
|                                           |        |          |          |
| 3) ===== CPU2017: -O2 -march=native       |        |          |          |
| Performance Regressions - Execution Time  | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2017/INT/557.xz_r                | 2.51%  | 397.0281 | 407.0107 |
| SPEC/SPEC2017/INT/520.omnetpp_r           | 1.69%  | 443.3886 | 450.8843 |
| SPEC/SPEC2017/FP/549.fotonik3d_r          | 1.44%  | 361.6036 | 366.7952 |
| SPEC/SPEC2017/FP/508.namd_r               | 1.07%  | 198.1652 | 200.2947 |
|                                           |        |          |          |
| Performance Improvements - Execution Time | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2017/INT/548.exchange2_r         | -6.78% | 433.2191 | 403.8602 |
| SPEC/SPEC2017/FP/507.cactuBSSN_r          | -1.49% | 223.5739 | 220.2388 |
| SPEC/SPEC2017/FP/511.povray_r             | -1.34% | 483.2885 | 476.8329 |
| SPEC/SPEC2017/INT/500.perlbench_r         | -1.03% | 417.0421 | 412.7557 |
|                                           |        |          |          |
| 4) ===== CPU2017: -Ofast -march=native    |        |          |          |
| Performance Regressions - Execution Time  | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2017/FP/544.nab_r                | 7.67%  | 314.2308 | 338.3466 |
| SPEC/SPEC2017/FP/538.imagick_r            | 7.26%  | 352.263  | 377.8382 |
| SPEC/SPEC2017/FP/526.blender_r            | 1.56%  | 270.6487 | 274.8841 |
| SPEC/SPEC2017/INT/548.exchange2_r         | 1.55%  | 348.1742 | 353.5598 |
| SPEC/SPEC2017/INT/523.xalancbmk_r         | 1.41%  | 298.9601 | 303.1769 |
|                                           |        |          |          |
| Performance Improvements - Execution Time | Δ (B)  | Baseline | Current  |
| SPEC/SPEC2017/INT/505.mcf_r               | -1.14% | 310.5277 | 306.9915 |
| SPEC/SPEC2017/INT/500.perlbench_r         | -1.07% | 413.7797 | 409.3348 |
+-------------------------------------------+--------+----------+----------+

I'm planning to comment each predictor change in upcoming emails which
contain patches.

Apart from that the patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Martin

[-- Attachment #2: 2017-per-benchmark.txt.bz2 --]
[-- Type: application/x-bzip, Size: 11760 bytes --]

[-- Attachment #3: 2006-per-benchmark.txt.bz2 --]
[-- Type: application/x-bzip, Size: 13310 bytes --]

[-- Attachment #4: all.txt --]
[-- Type: text/plain, Size: 7780 bytes --]

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
fp_opcode (on trees)                            1   0.0%      100.00%   99.97% /  99.97%         169555  169.56k   0.0%          90%  10.0% 100%:1
noreturn call                                   6   0.0%       66.67%    9.29% /  97.36%          15332   15.33k   0.0%         100% -90.7% 89%:1
loop iv compare                                87   0.1%       77.01%   80.39% /  80.58%      210991882  210.99M   0.0%                     17%:1
loop guard with recursion                     124   0.1%       89.52%   85.01% /  85.01%    97523698253   97.52G   0.9%          85%   0.0% 76%:4
extra loop exit                               163   0.1%       72.39%   86.89% /  92.62%     2599164174    2.60G   0.0%          83%   3.9% 68%:1
indirect call                                 510   0.4%       52.94%   48.23% /  93.11%    34860084930   34.86G   0.3%          86% -37.8% 83%:3
guess loop iv compare                         596   0.4%       87.08%   94.45% /  95.68%    47594135196   47.59G   0.5%          98%  -3.6% 87%:1
polymorphic call                              653   0.5%       60.64%   49.81% /  85.95%    10184258333   10.18G   0.1%          59%  -9.2% 12%:1
loop exit with recursion                      725   0.5%       72.83%   85.49% /  85.98%   127789677884  127.79G   1.2%          72%  13.5% 65%:4
negative return                               731   0.5%       71.14%   47.90% /  86.56%    14611024514   14.61G   0.1%          98% -50.1% 47%:3
const return                                  827   0.6%       52.84%   83.58% /  95.50%     7104928950    7.10G   0.1%          69%  14.6% 54%:5
recursive call                                978   0.7%       65.64%   63.51% /  87.36%    15734579342   15.73G   0.2%          75% -11.5% 30%:2
null return                                  1361   1.0%       61.94%   86.43% /  92.87%    15032249850   15.03G   0.1%          91%  -4.6% 53%:1
loop guard                                   2667   2.0%       54.59%   73.60% /  83.90%   245866667771  245.87G   2.4%          66%   7.6% 10%:1
opcode values positive (on trees)            4642   3.4%       59.35%   58.76% /  85.69%   378777493600  378.78G   3.6%          64%  -5.2%
loop exit                                    7109   5.3%       76.13%   88.82% /  91.15%  1148890223736    1.15T  11.0%          85%   3.8%
loop iterations                              9760   7.3%       99.94%   80.20% /  80.20%  1688097819329    1.69T  16.2%                     17%:1
pointer (on trees)                          14022  10.4%       58.54%   72.68% /  88.64%   157668907353  157.67G   1.5%          70%   2.7% 10%:1
call                                        22143  16.5%       61.18%   67.34% /  86.99%   311353504956  311.35G   3.0%          67%   0.3%
opcode values nonequal (on trees)           23459  17.4%       67.89%   65.94% /  83.08%   662814065153  662.81G   6.4%          66%  -0.1%
guessed loop iterations                     32907  24.4%       99.81%   92.55% /  92.63%  3684504131107    3.68T  35.3%                    

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
no prediction                               28954  21.5%       40.35%   33.11% /  81.31%  2188332970675    2.19T  21.0%                    
first match                                 50757  37.7%       93.00%   88.31% /  89.09%  6652092023444    6.65T  63.8%                    
DS theory                                   54866  40.8%       62.17%   69.81% /  85.39%  1590538313906    1.59T  15.2%                    
combined                                   134595 100.0%       69.11%   73.91% /  86.89% 10430968318253   10.43T 100.0%                    

 ===== COVERAGE THRESHOLD == 10% =====

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
fp_opcode (on trees)                            1   0.0%      100.00%   99.97% /  99.97%         169555  169.56k   0.0%          90%  10.0%
noreturn call                                   5   0.0%       80.00%   62.06% /  98.17%           1692    1.69k   0.0%         100% -37.9%
loop iv compare                                86   0.1%       76.74%   86.59% /  86.82%      175230548  175.23M   0.0%                    
loop guard with recursion                     120   0.1%       89.17%   72.16% /  72.17%    23498480299   23.50G   0.2%          85% -12.8%
extra loop exit                               162   0.1%       72.22%   66.89% /  84.59%      841523903  841.52M   0.0%          83% -16.1%
indirect call                                 507   0.4%       52.86%   59.42% /  84.43%     5998351879    6.00G   0.1%          86% -26.6%
guess loop iv compare                         595   0.4%       87.06%   63.59% /  72.78%     6373834696    6.37G   0.1%          98% -34.4%
polymorphic call                              652   0.5%       60.74%   56.30% /  84.19%     8995127137    9.00G   0.1%          59%  -2.7%
loop exit with recursion                      721   0.5%       72.68%   77.82% /  79.20%    45263271344   45.26G   0.4%          72%   5.8%
negative return                               728   0.5%       71.15%   51.10% /  93.41%     7718817446    7.72G   0.1%          98% -46.9%
const return                                  822   0.6%       52.55%   65.22% /  91.00%     3285400153    3.29G   0.0%          69%  -3.8%
recursive call                                976   0.7%       65.57%   48.06% /  82.01%    11054338004   11.05G   0.1%          75% -26.9%
null return                                  1360   1.0%       61.91%   71.12% /  84.83%     7062682174    7.06G   0.1%          91% -19.9%
loop guard                                   2666   2.0%       54.58%   70.68% /  82.14%   221068697936  221.07G   2.1%          66%   4.7%
opcode values positive (on trees)            4642   3.4%       59.35%   58.76% /  85.69%   378777493600  378.78G   3.6%          64%  -5.2%
loop exit                                    7109   5.3%       76.13%   88.82% /  91.15%  1148890223736    1.15T  11.0%          85%   3.8%
loop iterations                              9759   7.3%       99.94%   81.27% /  81.27%  1400135529329    1.40T  13.4%                    
pointer (on trees)                          14021  10.4%       58.54%   69.94% /  87.69%   141857126826  141.86G   1.4%          70%  -0.1%
call                                        22143  16.5%       61.18%   67.34% /  86.99%   311353504956  311.35G   3.0%          67%   0.3%
opcode values nonequal (on trees)           23459  17.4%       67.89%   65.94% /  83.08%   662814065153  662.81G   6.4%          66%  -0.1%
guessed loop iterations                     32907  24.4%       99.81%   92.55% /  92.63%  3684504131107    3.68T  35.3%                    

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
no prediction                               28954  21.5%       40.35%   33.11% /  81.31%  2188332970675    2.19T  21.0%                    
first match                                 50757  37.7%       93.00%   88.31% /  89.09%  6652092023444    6.65T  63.8%                    
DS theory                                   54866  40.8%       62.17%   69.81% /  85.39%  1590538313906    1.59T  15.2%                    
combined                                   134595 100.0%       69.11%   73.91% /  86.89% 10430968318253   10.43T 100.0%                    

[-- Attachment #5: 2017-all.txt --]
[-- Type: text/plain, Size: 7780 bytes --]

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
fp_opcode (on trees)                            1   0.0%      100.00%   99.97% /  99.97%         169555  169.56k   0.0%          90%  10.0% 100%:1
noreturn call                                   4   0.0%       50.00%    2.88% /  97.19%          14318   14.32k   0.0%         100% -97.1% 95%:1
loop iv compare                                28   0.0%       85.71%   90.36% /  90.42%      142853110  142.85M   0.0%                     78%:7
loop guard with recursion                      60   0.1%       86.67%   85.00% /  85.01%    97187655258   97.19G   1.9%          85%   0.0% 76%:4
extra loop exit                                87   0.1%       70.11%   90.57% /  95.57%     2304480228    2.30G   0.0%          83%   7.6% 76%:1
guess loop iv compare                         222   0.3%       82.43%   49.49% /  67.79%     2939751990    2.94G   0.1%          98% -48.5% 47%:2
loop exit with recursion                      405   0.5%       72.59%   85.81% /  86.13%   122008014554  122.01G   2.4%          72%  13.8% 68%:4
negative return                               408   0.5%       70.10%   11.53% /  97.37%     4294690284    4.29G   0.1%          98% -86.5% 86%:2
indirect call                                 415   0.5%       51.81%   82.78% /  93.30%    13268053185   13.27G   0.3%          86%  -3.2% 63%:1
const return                                  429   0.5%       50.35%   87.87% /  95.79%     6093205123    6.09G   0.1%          69%  18.9% 74%:6
polymorphic call                              442   0.6%       64.71%   56.07% /  84.59%     5625048806    5.63G   0.1%          59%  -2.9% 43%:3
recursive call                                542   0.7%       66.97%   44.47% /  83.19%     6006719153    6.01G   0.1%          75% -30.5% 10%:1
null return                                   977   1.2%       61.62%   91.99% /  95.61%    10924810708   10.92G   0.2%          91%   1.0% 73%:1
loop guard                                   1345   1.7%       56.51%   74.99% /  81.44%   172002804836  172.00G   3.4%          66%   9.0% 25%:2
opcode values positive (on trees)            2420   3.1%       62.19%   54.73% /  85.64%   281436309157  281.44G   5.6%          64%  -9.3%
loop exit                                    4040   5.2%       75.52%   89.04% /  90.91%   582174829649  582.17G  11.7%          85%   4.0%
loop iterations                              5616   7.2%       99.98%   83.65% /  83.65%   845156713258  845.16G  16.9%                    
pointer (on trees)                           9449  12.1%       57.22%   68.91% /  88.62%    89205915254   89.21G   1.8%          70%  -1.1% 11%:1
opcode values nonequal (on trees)           13266  16.9%       68.97%   64.82% /  84.68%   258325826516  258.33G   5.2%          66%  -1.2%
call                                        14158  18.1%       63.77%   66.51% /  87.25%   208458745217  208.46G   4.2%          67%  -0.5% 10%:1
guessed loop iterations                     18494  23.6%       99.88%   91.98% /  91.98%  1706074764271    1.71T  34.2%                    

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
no prediction                               16937  21.6%       39.19%   39.75% /  81.37%   889936912032  889.94G  17.8%                    
first match                                 28674  36.6%       92.47%   88.88% /  89.41%  3257861669388    3.26T  65.3%                    
DS theory                                   32684  41.7%       62.54%   69.37% /  85.50%   839261965939  839.26G  16.8%                    
combined                                    78304 100.0%       68.45%   76.83% /  87.32%  4987064608146    4.99T 100.0%                    

 ===== COVERAGE THRESHOLD == 10% =====

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
fp_opcode (on trees)                            1   0.0%      100.00%   99.97% /  99.97%         169555  169.56k   0.0%          90%  10.0%
noreturn call                                   3   0.0%       66.67%    5.60% /  95.72%            678   678.00   0.0%         100% -94.4%
loop iv compare                                21   0.0%       80.95%   73.99% /  74.24%       30934006   30.93M   0.0%                    
loop guard with recursion                      56   0.1%       85.71%   71.95% /  71.96%    23162437304   23.16G   0.5%          85% -13.0%
extra loop exit                                86   0.1%       69.77%   71.62% /  92.72%      546839957  546.84M   0.0%          83% -11.4%
guess loop iv compare                         220   0.3%       82.73%   54.00% /  78.71%     1552771353    1.55G   0.0%          98% -44.0%
loop exit with recursion                      401   0.5%       72.32%   77.68% /  78.66%    39481608014   39.48G   0.8%          72%   5.7%
negative return                               406   0.5%       70.44%   75.10% /  89.12%      595240362  595.24M   0.0%          98% -22.9%
indirect call                                 414   0.5%       51.69%   57.08% /  85.30%     4945756572    4.95G   0.1%          86% -28.9%
const return                                  423   0.5%       49.65%   55.49% /  85.51%     1606931929    1.61G   0.0%          69% -13.5%
polymorphic call                              439   0.6%       64.46%   41.53% /  91.82%     3189952499    3.19G   0.1%          59% -17.5%
recursive call                                541   0.7%       67.10%   46.36% /  84.52%     5381178946    5.38G   0.1%          75% -28.6%
null return                                   976   1.2%       61.58%   70.38% /  83.77%     2955243032    2.96G   0.1%          91% -20.6%
loop guard                                   1343   1.7%       56.44%   69.15% /  77.77%   128679704564  128.68G   2.6%          66%   3.1%
opcode values positive (on trees)            2420   3.1%       62.19%   54.73% /  85.64%   281436309157  281.44G   5.6%          64%  -9.3%
loop exit                                    4040   5.2%       75.52%   89.04% /  90.91%   582174829649  582.17G  11.7%          85%   4.0%
loop iterations                              5616   7.2%       99.98%   83.65% /  83.65%   845156713258  845.16G  16.9%                    
pointer (on trees)                           9448  12.1%       57.22%   67.62% /  89.69%    79627977265   79.63G   1.6%          70%  -2.4%
opcode values nonequal (on trees)           13266  16.9%       68.97%   64.82% /  84.68%   258325826516  258.33G   5.2%          66%  -1.2%
call                                        14157  18.1%       63.76%   63.45% /  86.54%   187217992716  187.22G   3.8%          67%  -3.5%
guessed loop iterations                     18494  23.6%       99.88%   91.98% /  91.98%  1706074764271    1.71T  34.2%                    

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
no prediction                               16937  21.6%       39.19%   39.75% /  81.37%   889936912032  889.94G  17.8%                    
first match                                 28674  36.6%       92.47%   88.88% /  89.41%  3257861669388    3.26T  65.3%                    
DS theory                                   32684  41.7%       62.54%   69.37% /  85.50%   839261965939  839.26G  16.8%                    
combined                                    78304 100.0%       68.45%   76.83% /  87.32%  4987064608146    4.99T 100.0%                    

[-- Attachment #6: 2006-all.txt --]
[-- Type: text/plain, Size: 7500 bytes --]

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
noreturn call                                   2   0.0%      100.00%   99.80% /  99.80%           1014    1.01k   0.0%         100%  -0.2% 100%:1
loop iv compare                                59   0.1%       72.88%   59.48% /  59.95%       68138772   68.14M   0.0%                     82%:3
loop guard with recursion                      64   0.1%       92.19%   86.13% /  86.35%      336042995  336.04M   0.0%          85%   1.1% 39%:2
extra loop exit                                76   0.1%       75.00%   58.11% /  69.50%      294683946  294.68M   0.0%          83% -24.9% 69%:2
indirect call                                  95   0.2%       57.89%   27.00% /  92.99%    21592031745   21.59G   0.4%          86% -59.0% 95%:2
polymorphic call                              211   0.4%       52.13%   42.09% /  87.63%     4559209527    4.56G   0.1%          59% -16.9% 63%:3
loop exit with recursion                      320   0.6%       73.12%   78.78% /  82.88%     5781663330    5.78G   0.1%          72%   6.8% 32%:3
negative return                               323   0.6%       72.45%   63.05% /  82.06%    10316334230   10.32G   0.2%          98% -35.0% 78%:5
guess loop iv compare                         374   0.7%       89.84%   97.41% /  97.51%    44654383206   44.65G   0.8%          98%  -0.6% 92%:1
null return                                   384   0.7%       62.76%   71.66% /  85.59%     4107439142    4.11G   0.1%          91% -19.3% 28%:2
const return                                  398   0.7%       55.53%   57.74% /  93.78%     1011723827    1.01G   0.0%          69% -11.3% 58%:2
recursive call                                436   0.8%       63.99%   75.27% /  89.93%     9727860189    9.73G   0.2%          75%   0.3% 48%:2
loop guard                                   1322   2.3%       52.65%   70.36% /  89.63%    73863862935   73.86G   1.4%          66%   4.4% 44%:2
opcode values positive (on trees)            2222   3.9%       56.26%   70.43% /  85.83%    97341184443   97.34G   1.8%          64%   6.4%
loop exit                                    3069   5.5%       76.93%   88.59% /  91.39%   566715394087  566.72G  10.4%          85%   3.6%
loop iterations                              4144   7.4%       99.88%   76.75% /  76.75%   842941106071  842.94G  15.5%                     46%:2
pointer (on trees)                           4573   8.1%       61.27%   77.58% /  88.67%    68462992099   68.46G   1.3%          70%   7.6% 23%:1
call                                         7985  14.2%       56.61%   69.01% /  86.45%   102894759739  102.89G   1.9%          67%   2.0%
opcode values nonequal (on trees)           10193  18.1%       66.48%   66.65% /  82.05%   404488238637  404.49G   7.4%          66%   0.6%
guessed loop iterations                     14413  25.6%       99.74%   93.04% /  93.19%  1978429366836    1.98T  36.3%                    

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
no prediction                               12017  21.3%       41.97%   28.56% /  81.27%  1298396058643    1.30T  23.9%                     11%:1
first match                                 22083  39.2%       93.69%   87.77% /  88.79%  3394230354056    3.39T  62.3%                    
DS theory                                   22182  39.4%       61.63%   70.30% /  85.26%   751276347967  751.28G  13.8%                    
combined                                    56291 100.0%       70.02%   71.23% /  86.51%  5443903710107    5.44T 100.0%                    

 ===== COVERAGE THRESHOLD == 10% =====

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
noreturn call                                   1   0.0%      100.00%   50.00% /  50.00%              2     2.00   0.0%         100% -50.0%
loop iv compare                                56   0.1%       73.21%   80.70% /  81.58%       12101759   12.10M   0.0%                    
loop guard with recursion                      62   0.1%       91.94%   86.43% /  86.79%      204613423  204.61M   0.0%          85%   1.4%
extra loop exit                                74   0.1%       74.32%   47.18% /  83.86%       91517959   91.52M   0.0%          83% -35.8%
indirect call                                  93   0.2%       58.06%   70.44% /  80.36%     1052595307    1.05G   0.0%          86% -15.6%
polymorphic call                              208   0.4%       51.92%   36.67% /  90.91%     1668212873    1.67G   0.0%          59% -22.3%
loop exit with recursion                      317   0.6%       72.87%   78.43% /  84.42%     3949926524    3.95G   0.1%          72%   6.4%
negative return                               318   0.6%       72.64%   88.38% /  93.21%     2259767195    2.26G   0.0%          98%  -9.6%
guess loop iv compare                         373   0.7%       89.81%   75.66% /  77.06%     3434082706    3.43G   0.1%          98% -22.3%
null return                                   382   0.7%       62.57%   72.54% /  91.82%     2968314759    2.97G   0.1%          91% -18.5%
const return                                  396   0.7%       55.56%   84.97% /  91.03%      425889042  425.89M   0.0%          69%  16.0%
recursive call                                434   0.8%       63.82%   52.34% /  80.60%     5047618851    5.05G   0.1%          75% -22.7%
loop guard                                   1320   2.3%       52.65%   65.86% /  81.79%    41342679266   41.34G   0.8%          66%  -0.1%
opcode values positive (on trees)            2222   3.9%       56.26%   70.43% /  85.83%    97341184443   97.34G   1.8%          64%   6.4%
loop exit                                    3069   5.5%       76.93%   88.59% /  91.39%   566715394087  566.72G  10.4%          85%   3.6%
loop iterations                              4142   7.4%       99.88%   78.21% /  78.21%   458991386071  458.99G   8.4%                    
pointer (on trees)                           4572   8.1%       61.26%   71.69% /  86.11%    52651211572   52.65G   1.0%          70%   1.7%
call                                         7985  14.2%       56.61%   69.01% /  86.45%   102894759739  102.89G   1.9%          67%   2.0%
opcode values nonequal (on trees)           10193  18.1%       66.48%   66.65% /  82.05%   404488238637  404.49G   7.4%          66%   0.6%
guessed loop iterations                     14413  25.6%       99.74%   93.04% /  93.19%  1978429366836    1.98T  36.3%                    

HEURISTICS                               BRANCHES  (REL)  BR. HITRATE            HITRATE       COVERAGE COVERAGE  (REL)  predict.def  (REL) HOT branches (>10%)
no prediction                               12016  21.3%       41.98%   29.49% /  81.49%  1162005681171    1.16T  21.3%                    
first match                                 22083  39.2%       93.69%   87.77% /  88.79%  3394230354056    3.39T  62.3%                    
DS theory                                   22182  39.4%       61.63%   70.30% /  85.26%   751276347967  751.28G  13.8%                    
combined                                    56291 100.0%       70.02%   71.23% /  86.51%  5443903710107    5.44T 100.0%                    

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] Fix usage of analyze_brprob.py script.
  2018-01-09 11:23 Tweak predictors based on SPEC2006 and SPEC2017 Martin Liška
@ 2018-01-09 11:26 ` Martin Liška
  2018-01-09 11:30 ` [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in, predict.def Martin Liška
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Martin Liška @ 2018-01-09 11:26 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 322 bytes --]

First patch from the series simplifies how we dump and post-process statistics.
I switched to use ';' separated values instead of a complex human language. Changes
to analyze_brprob.py add possibility to filter out dominant edges. Having that one
can see how predictor values change if a dominant edge is ignored.

Martin

[-- Attachment #2: 0001-Fix-usage-of-analyze_brprob.py-script.patch --]
[-- Type: text/x-patch, Size: 9537 bytes --]

From ef3c162961f3599ee65e87ecde5138d9f37f0221 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 21 Dec 2017 17:19:13 +0100
Subject: [PATCH 1/5] Fix usage of analyze_brprob.py script.

contrib/ChangeLog:

2017-12-21  Martin Liska  <mliska@suse.cz>

	* analyze_brprob.py: Support new format that can be easily
	parsed. Add new column to report.

gcc/ChangeLog:

2017-12-21  Martin Liska  <mliska@suse.cz>

	* predict.c (dump_prediction): Add new format for
	analyze_brprob.py script which is enabled with -details
	suboption.
	* profile-count.h (precise_p): New function.
---
 contrib/analyze_brprob.py | 103 ++++++++++++++++++++++++++++++++--------------
 gcc/predict.c             |  13 ++++++
 gcc/profile-count.h       |   5 +++
 3 files changed, 90 insertions(+), 31 deletions(-)

diff --git a/contrib/analyze_brprob.py b/contrib/analyze_brprob.py
index e03d1da1cde..de5f474d629 100755
--- a/contrib/analyze_brprob.py
+++ b/contrib/analyze_brprob.py
@@ -71,6 +71,7 @@ from math import *
 
 counter_aggregates = set(['combined', 'first match', 'DS theory',
     'no prediction'])
+hot_threshold = 10
 
 def percentage(a, b):
     return 100.0 * a / b
@@ -131,47 +132,87 @@ class PredictDefFile:
             with open(self.path, 'w+') as f:
                 for l in modified_lines:
                     f.write(l + '\n')
+class Heuristics:
+    def __init__(self, count, hits, fits):
+        self.count = count
+        self.hits = hits
+        self.fits = fits
 
 class Summary:
     def __init__(self, name):
         self.name = name
-        self.branches = 0
-        self.successfull_branches = 0
-        self.count = 0
-        self.hits = 0
-        self.fits = 0
+        self.edges= []
+
+    def branches(self):
+        return len(self.edges)
+
+    def hits(self):
+        return sum([x.hits for x in self.edges])
+
+    def fits(self):
+        return sum([x.fits for x in self.edges])
+
+    def count(self):
+        return sum([x.count for x in self.edges])
+
+    def successfull_branches(self):
+        return len([x for x in self.edges if 2 * x.hits >= x.count])
 
     def get_hitrate(self):
-        return 100.0 * self.hits / self.count
+        return 100.0 * self.hits() / self.count()
 
     def get_branch_hitrate(self):
-        return 100.0 * self.successfull_branches / self.branches
+        return 100.0 * self.successfull_branches() / self.branches()
 
     def count_formatted(self):
-        v = self.count
+        v = self.count()
         for unit in ['', 'k', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y']:
             if v < 1000:
                 return "%3.2f%s" % (v, unit)
             v /= 1000.0
         return "%.1f%s" % (v, 'Y')
 
+    def count(self):
+        return sum([x.count for x in self.edges])
+
     def print(self, branches_max, count_max, predict_def):
+        # filter out most hot edges (if requested)
+        self.edges = sorted(self.edges, reverse = True, key = lambda x: x.count)
+        if args.coverage_threshold != None:
+            threshold = args.coverage_threshold * self.count() / 100
+            edges = [x for x in self.edges if x.count < threshold]
+            if len(edges) != 0:
+                self.edges = edges
+
         predicted_as = None
         if predict_def != None and self.name in predict_def.predictors:
             predicted_as = predict_def.predictors[self.name]
 
         print('%-40s %8i %5.1f%% %11.2f%% %7.2f%% / %6.2f%% %14i %8s %5.1f%%' %
-            (self.name, self.branches,
-                percentage(self.branches, branches_max),
+            (self.name, self.branches(),
+                percentage(self.branches(), branches_max),
                 self.get_branch_hitrate(),
                 self.get_hitrate(),
-                percentage(self.fits, self.count),
-                self.count, self.count_formatted(),
-                percentage(self.count, count_max)), end = '')
+                percentage(self.fits(), self.count()),
+                self.count(), self.count_formatted(),
+                percentage(self.count(), count_max)), end = '')
 
         if predicted_as != None:
             print('%12i%% %5.1f%%' % (predicted_as,
                 self.get_hitrate() - predicted_as), end = '')
+        else:
+            print(' ' * 20, end = '')
+
+        # print details about the most important edges
+        if args.coverage_threshold == None:
+            edges = [x for x in self.edges[:100] if x.count * hot_threshold > self.count()]
+            if args.verbose:
+                for c in edges:
+                    r = 100.0 * c.count / self.count()
+                    print(' %.0f%%:%d' % (r, c.count), end = '')
+            elif len(edges) > 0:
+                print(' %0.0f%%:%d' % (100.0 * sum([x.count for x in edges]) / self.count(), len(edges)), end = '')
+
         print()
 
 class Profile:
@@ -185,33 +226,29 @@ class Profile:
             self.heuristics[name] = Summary(name)
 
         s = self.heuristics[name]
-        s.branches += 1
 
-        s.count += count
         if prediction < 50:
             hits = count - hits
         remaining = count - hits
-        if hits >= remaining:
-            s.successfull_branches += 1
+        fits = max(hits, remaining)
 
-        s.hits += hits
-        s.fits += max(hits, remaining)
+        s.edges.append(Heuristics(count, hits, fits))
 
     def add_loop_niter(self, niter):
         if niter > 0:
             self.niter_vector.append(niter)
 
     def branches_max(self):
-        return max([v.branches for k, v in self.heuristics.items()])
+        return max([v.branches() for k, v in self.heuristics.items()])
 
     def count_max(self):
-        return max([v.count for k, v in self.heuristics.items()])
+        return max([v.count() for k, v in self.heuristics.items()])
 
     def print_group(self, sorting, group_name, heuristics, predict_def):
         count_max = self.count_max()
         branches_max = self.branches_max()
 
-        sorter = lambda x: x.branches
+        sorter = lambda x: x.branches()
         if sorting == 'branch-hitrate':
             sorter = lambda x: x.get_branch_hitrate()
         elif sorting == 'hitrate':
@@ -221,10 +258,10 @@ class Profile:
         elif sorting == 'name':
             sorter = lambda x: x.name.lower()
 
-        print('%-40s %8s %6s %12s %18s %14s %8s %6s %12s %6s' %
+        print('%-40s %8s %6s %12s %18s %14s %8s %6s %12s %6s %s' %
             ('HEURISTICS', 'BRANCHES', '(REL)',
             'BR. HITRATE', 'HITRATE', 'COVERAGE', 'COVERAGE', '(REL)',
-            'predict.def', '(REL)'))
+            'predict.def', '(REL)', 'HOT branches (>%d%%)' % hot_threshold))
         for h in sorted(heuristics, key = sorter):
             h.print(branches_max, count_max, predict_def)
 
@@ -266,19 +303,23 @@ parser.add_argument('-s', '--sorting', dest = 'sorting',
 parser.add_argument('-d', '--def-file', help = 'path to predict.def')
 parser.add_argument('-w', '--write-def-file', action = 'store_true',
     help = 'Modify predict.def file in order to set new numbers')
+parser.add_argument('-c', '--coverage-threshold', type = int,
+    help = 'Ignore edges that have percentage coverage >= coverage-threshold')
+parser.add_argument('-v', '--verbose', action = 'store_true', help = 'Print verbose informations')
 
 args = parser.parse_args()
 
 profile = Profile(args.dump_file)
-r = re.compile('  (.*) heuristics( of edge [0-9]*->[0-9]*)?( \\(.*\\))?: (.*)%.*exec ([0-9]*) hit ([0-9]*)')
 loop_niter_str = ';;  profile-based iteration count: '
+
 for l in open(args.dump_file):
-    m = r.match(l)
-    if m != None and m.group(3) == None:
-        name = m.group(1)
-        prediction = float(m.group(4))
-        count = int(m.group(5))
-        hits = int(m.group(6))
+    if l.startswith(';;heuristics;'):
+        parts = l.strip().split(';')
+        assert len(parts) == 8
+        name = parts[3]
+        prediction = float(parts[6])
+        count = int(parts[4])
+        hits = int(parts[5])
 
         profile.add(name, prediction, count, hits)
     elif l.startswith(loop_niter_str):
diff --git a/gcc/predict.c b/gcc/predict.c
index 9cea90d1063..3ac18a2c5f2 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -747,6 +747,19 @@ dump_prediction (FILE *file, enum br_predictor predictor, int probability,
     }
 
   fprintf (file, "\n");
+
+  /* Print output that be easily read by analyze_brprob.py script. We are
+     interested only in counts that are read from GCDA files.  */
+  if (dump_file && (dump_flags & TDF_DETAILS)
+      && bb->count.precise_p ()
+      && reason == REASON_NONE)
+    {
+      gcc_assert (e->count ().precise_p ());
+      fprintf (file, ";;heuristics;%s;%" PRId64 ";%" PRId64 ";%.1f;\n",
+	       predictor_info[predictor].name,
+	       bb->count.to_gcov_type (), e->count ().to_gcov_type (),
+	       probability * 100.0 / REG_BR_PROB_BASE);
+    }
 }
 
 /* Return true if STMT is known to be unlikely executed.  */
diff --git a/gcc/profile-count.h b/gcc/profile-count.h
index 3c5f720ee81..3a37b1293e5 100644
--- a/gcc/profile-count.h
+++ b/gcc/profile-count.h
@@ -689,6 +689,11 @@ public:
     {
       return !initialized_p () || m_quality >= profile_guessed_global0;
     }
+  /* Return true if quality of profile is precise.  */
+  bool precise_p () const
+    {
+      return m_quality == profile_precise;
+    }
 
   /* When merging basic blocks, the two different profile counts are unified.
      Return true if this can be done without losing info about profile.
-- 
2.14.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in, predict.def.
  2018-01-09 11:23 Tweak predictors based on SPEC2006 and SPEC2017 Martin Liška
  2018-01-09 11:26 ` [PATCH 1/5] Fix usage of analyze_brprob.py script Martin Liška
@ 2018-01-09 11:30 ` Martin Liška
  2018-01-11 10:44   ` Jan Hubicka
  2018-01-09 11:40 ` [PATCH 3/5] Adjust predictor values according to SPEC2006 and, SPEC2017 Martin Liška
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Martin Liška @ 2018-01-09 11:30 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 318 bytes --]

Second patch cleans up predictors that do not use a value from predict.def,
but their value is derived from e.g. number of iterations of a loop.
These predictors have set probability to PROB_UNINITIALIZED and analyze_branch.py
does not compare statistics to values defined in predict.def.

The patch is no-op.

Martin

[-- Attachment #2: 0002-Introduce-PROB_UNINITIALIZED-constant-and-use-it-in-.patch --]
[-- Type: text/x-patch, Size: 4082 bytes --]

From 6f6f2d88d3141b1e1604698abf857fffbb42330e Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Wed, 27 Dec 2017 14:49:20 +0100
Subject: [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in
 predict.def.

gcc/ChangeLog:

2017-12-28  Martin Liska  <mliska@suse.cz>

	* predict.c (predict_insn_def): Add new assert.
	(struct branch_predictor): Change type to signed integer.
	(test_prediction_value_range): Amend test to cover
	PROB_UNINITIALIZED.
	* predict.def (PRED_LOOP_ITERATIONS): Use the new constant.
	(PRED_LOOP_ITERATIONS_GUESSED): Likewise.
	(PRED_LOOP_ITERATIONS_MAX): Likewise.
	(PRED_LOOP_IV_COMPARE): Likewise.
	* predict.h (PROB_UNINITIALIZED): Define new constant.
---
 gcc/predict.c   | 6 +++++-
 gcc/predict.def | 8 ++++----
 gcc/predict.h   | 1 +
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/predict.c b/gcc/predict.c
index 3ac18a2c5f2..51fd14205c2 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -545,6 +545,7 @@ predict_insn_def (rtx_insn *insn, enum br_predictor predictor,
 		  enum prediction taken)
 {
    int probability = predictor_info[(int) predictor].hitrate;
+   gcc_assert (probability != PROB_UNINITIALIZED);
 
    if (taken != TAKEN)
      probability = REG_BR_PROB_BASE - probability;
@@ -4185,7 +4186,7 @@ namespace selftest {
 struct branch_predictor
 {
   const char *name;
-  unsigned probability;
+  int probability;
 };
 
 #define DEF_PREDICTOR(ENUM, NAME, HITRATE, FLAGS) { NAME, HITRATE },
@@ -4200,6 +4201,9 @@ test_prediction_value_range ()
 
   for (unsigned i = 0; predictors[i].name != NULL; i++)
     {
+      if (predictors[i].probability == PROB_UNINITIALIZED)
+	continue;
+
       unsigned p = 100 * predictors[i].probability / REG_BR_PROB_BASE;
       ASSERT_TRUE (p > 50 && p <= 100);
     }
diff --git a/gcc/predict.def b/gcc/predict.def
index 0f37e399312..390b9a35fa7 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -54,7 +54,7 @@ DEF_PREDICTOR (PRED_UNCONDITIONAL, "unconditional jump", PROB_ALWAYS,
 /* Use number of loop iterations determined by # of iterations
    analysis to set probability.  We don't want to use Dempster-Shaffer
    theory here, as the predictions is exact.  */
-DEF_PREDICTOR (PRED_LOOP_ITERATIONS, "loop iterations", PROB_ALWAYS,
+DEF_PREDICTOR (PRED_LOOP_ITERATIONS, "loop iterations", PROB_UNINITIALIZED,
 	       PRED_FLAG_FIRST_MATCH)
 
 /* Assume that any given atomic operation has low contention,
@@ -71,11 +71,11 @@ DEF_PREDICTOR (PRED_BUILTIN_EXPECT, "__builtin_expect", PROB_VERY_LIKELY,
 
 /* Use number of loop iterations guessed by the contents of the loop.  */
 DEF_PREDICTOR (PRED_LOOP_ITERATIONS_GUESSED, "guessed loop iterations",
-	       PROB_ALWAYS, PRED_FLAG_FIRST_MATCH)
+	       PROB_UNINITIALIZED, PRED_FLAG_FIRST_MATCH)
 
 /* Use number of loop iterations guessed by the contents of the loop.  */
 DEF_PREDICTOR (PRED_LOOP_ITERATIONS_MAX, "guessed loop iterations",
-	       PROB_ALWAYS, PRED_FLAG_FIRST_MATCH)
+	       PROB_UNINITIALIZED, PRED_FLAG_FIRST_MATCH)
 
 /* Branch containing goto is probably not taken.  */
 DEF_PREDICTOR (PRED_CONTINUE, "continue", HITRATE (67), 0)
@@ -151,7 +151,7 @@ DEF_PREDICTOR (PRED_LOOP_IV_COMPARE_GUESS, "guess loop iv compare",
 
 /* Use number of loop iterations determined by # of iterations analysis
    to set probability of branches that compares IV to loop bound variable.  */
-DEF_PREDICTOR (PRED_LOOP_IV_COMPARE, "loop iv compare", PROB_VERY_LIKELY,
+DEF_PREDICTOR (PRED_LOOP_IV_COMPARE, "loop iv compare", PROB_UNINITIALIZED,
 	       PRED_FLAG_FIRST_MATCH)
 
 /* In the following code
diff --git a/gcc/predict.h b/gcc/predict.h
index 57715159b95..e4d1da090ca 100644
--- a/gcc/predict.h
+++ b/gcc/predict.h
@@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #define PROB_ALWAYS		(REG_BR_PROB_BASE)
 #define PROB_UNLIKELY           (REG_BR_PROB_BASE / 5 - 1)
 #define PROB_LIKELY             (REG_BR_PROB_BASE - PROB_UNLIKELY)
+#define PROB_UNINITIALIZED      (-1)
 
 #define DEF_PREDICTOR(ENUM, NAME, HITRATE, FLAGS) ENUM,
 enum br_predictor
-- 
2.14.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 3/5] Adjust predictor values according to SPEC2006 and, SPEC2017.
  2018-01-09 11:23 Tweak predictors based on SPEC2006 and SPEC2017 Martin Liška
  2018-01-09 11:26 ` [PATCH 1/5] Fix usage of analyze_brprob.py script Martin Liška
  2018-01-09 11:30 ` [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in, predict.def Martin Liška
@ 2018-01-09 11:40 ` Martin Liška
  2018-01-09 15:12   ` Jan Hubicka
  2018-01-09 11:46 ` [PATCH 4/5] Remove predictors that are unrealiable Martin Liška
  2018-01-09 11:48 ` [PATCH 5/5] Fix test-suite fallout Martin Liška
  4 siblings, 1 reply; 15+ messages in thread
From: Martin Liška @ 2018-01-09 11:40 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 830 bytes --]

Third part changes predictors values:

1) PRED_LOOP_EXIT: no dominant branch; value simply taken from statistics
2) PRED_LOOP_EXIT_WITH_RECURSION: there are 4 dominant edges; value taken w/o these edges
3) PRED_LOOP_EXTRA_EXIT: there's one really dominant edge; value taken w/o the edge; note that
coverage of the predictor is quite small
4) PRED_OPCODE_POSITIVE: no dominant branch; value simply taken from statistics
5) PRED_CONST_RETURN: there are 4 dominant edges, value taken w/o these edges
6) PRED_NULL_RETURN: one dominant edge, value taken w/o these edges
7) PRED_LOOP_IV_COMPARE_GUESS: very fragile predictor (according how is identified in predict.c);
has a dominant edge; value taken w/o these edges; in future I plan to investigate it
8) PRED_LOOP_GUARD: adjusted based on numbers without a one dominant edge

Martin

[-- Attachment #2: 0003-Adjust-predictor-values-according-to-SPEC2006-and-SP.patch --]
[-- Type: text/x-patch, Size: 4135 bytes --]

From 960f16a6e3e916524d8881f53913c15a3c2ec2ae Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 28 Dec 2017 10:13:50 +0100
Subject: [PATCH 3/5] Adjust predictor values according to SPEC2006 and
 SPEC2017.

gcc/ChangeLog:

2017-12-28  Martin Liska  <mliska@suse.cz>

	* predict.def (PRED_LOOP_EXIT): Change from 85 to 89.
	(PRED_LOOP_EXIT_WITH_RECURSION): Change from 72 to 78.
	(PRED_LOOP_EXTRA_EXIT): Change from 83 to 67.
	(PRED_OPCODE_POSITIVE): Change from 64 to 59.
	(PRED_TREE_OPCODE_POSITIVE): Change from 64 to 59.
	(PRED_CONST_RETURN): Change from 69 to 65.
	(PRED_NULL_RETURN): Change from 91 to 71.
	(PRED_LOOP_IV_COMPARE_GUESS): Change from 98 to 64.
	(PRED_LOOP_GUARD): Change from 66 to 73.
---
 gcc/predict.def | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/predict.def b/gcc/predict.def
index 390b9a35fa7..fe72080d5bd 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -89,16 +89,16 @@ DEF_PREDICTOR (PRED_COLD_FUNCTION, "cold function call", PROB_VERY_LIKELY,
 	       PRED_FLAG_FIRST_MATCH)
 
 /* Edge causing loop to terminate is probably not taken.  */
-DEF_PREDICTOR (PRED_LOOP_EXIT, "loop exit", HITRATE (85),
+DEF_PREDICTOR (PRED_LOOP_EXIT, "loop exit", HITRATE (89),
 	       PRED_FLAG_FIRST_MATCH)
 
 /* Same as LOOP_EXIT but for loops containing recursive call.  */
 DEF_PREDICTOR (PRED_LOOP_EXIT_WITH_RECURSION, "loop exit with recursion",
-	       HITRATE (72), PRED_FLAG_FIRST_MATCH)
+	       HITRATE (78), PRED_FLAG_FIRST_MATCH)
 
 /* Edge causing loop to terminate by computing value used by later
    conditional.  */
-DEF_PREDICTOR (PRED_LOOP_EXTRA_EXIT, "extra loop exit", HITRATE (83),
+DEF_PREDICTOR (PRED_LOOP_EXTRA_EXIT, "extra loop exit", HITRATE (67),
 	       PRED_FLAG_FIRST_MATCH)
 
 /* Pointers are usually not NULL.  */
@@ -106,11 +106,11 @@ DEF_PREDICTOR (PRED_POINTER, "pointer", HITRATE (70), 0)
 DEF_PREDICTOR (PRED_TREE_POINTER, "pointer (on trees)", HITRATE (70), 0)
 
 /* NE is probable, EQ not etc...  */
-DEF_PREDICTOR (PRED_OPCODE_POSITIVE, "opcode values positive", HITRATE (64), 0)
+DEF_PREDICTOR (PRED_OPCODE_POSITIVE, "opcode values positive", HITRATE (59), 0)
 DEF_PREDICTOR (PRED_OPCODE_NONEQUAL, "opcode values nonequal", HITRATE (66), 0)
 DEF_PREDICTOR (PRED_FPOPCODE, "fp_opcode", HITRATE (90), 0)
 DEF_PREDICTOR (PRED_TREE_OPCODE_POSITIVE, "opcode values positive (on trees)",
-	       HITRATE (64), 0)
+	       HITRATE (59), 0)
 DEF_PREDICTOR (PRED_TREE_OPCODE_NONEQUAL, "opcode values nonequal (on trees)",
 	       HITRATE (66), 0)
 DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
@@ -136,18 +136,18 @@ DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
 DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
 
 /* Branch ending with return constant is probably not taken.  */
-DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (69), 0)
+DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
 
 /* Branch ending with return negative constant is probably not taken.  */
 DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
 
 /* Branch ending with return; is probably not taken */
-DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (91), 0)
+DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
 
 /* Branches to compare induction variable to a loop bound is
    extremely likely.  */
 DEF_PREDICTOR (PRED_LOOP_IV_COMPARE_GUESS, "guess loop iv compare",
-	       HITRATE (98), 0)
+	       HITRATE (64), 0)
 
 /* Use number of loop iterations determined by # of iterations analysis
    to set probability of branches that compares IV to loop bound variable.  */
@@ -160,7 +160,7 @@ DEF_PREDICTOR (PRED_LOOP_IV_COMPARE, "loop iv compare", PROB_UNINITIALIZED,
        for (loop2)
 	 body;
    guess that cond is unlikely.  */
-DEF_PREDICTOR (PRED_LOOP_GUARD, "loop guard", HITRATE (66), 0)
+DEF_PREDICTOR (PRED_LOOP_GUARD, "loop guard", HITRATE (73), 0)
 
 /* Same but for loops containing recursion.  */
 DEF_PREDICTOR (PRED_LOOP_GUARD_WITH_RECURSION, "loop guard with recursion",
-- 
2.14.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 4/5] Remove predictors that are unrealiable.
  2018-01-09 11:23 Tweak predictors based on SPEC2006 and SPEC2017 Martin Liška
                   ` (2 preceding siblings ...)
  2018-01-09 11:40 ` [PATCH 3/5] Adjust predictor values according to SPEC2006 and, SPEC2017 Martin Liška
@ 2018-01-09 11:46 ` Martin Liška
  2018-01-11 10:41   ` Jan Hubicka
  2018-01-09 11:48 ` [PATCH 5/5] Fix test-suite fallout Martin Liška
  4 siblings, 1 reply; 15+ messages in thread
From: Martin Liška @ 2018-01-09 11:46 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 516 bytes --]

These predictors are in my opinion not reliable and thus I decided to remove them:

1) PRED_NEGATIVE_RETURN: probability is ~51%
2) PRED_RECURSIVE_CALL: there are 2 dominant edges that influence value to 63%;
w/o these edges it goes down to 52%
3) PRED_POLYMORPHIC_CALL: having very low coverage, probability is ~51%
4) PRED_INDIR_CALL: likewise

Question here is whether we want to remove them, or to predict them with a 'ignored'
flag? Doing that, we can measure statistics of the predictor in the future?

Martin

[-- Attachment #2: 0004-Remove-predictors-that-are-unrealiable.patch --]
[-- Type: text/x-patch, Size: 3904 bytes --]

From afbc86cb72eab37bcf6325954d0bf306b301f76e Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 28 Dec 2017 10:23:48 +0100
Subject: [PATCH 4/5] Remove predictors that are unrealiable.

gcc/ChangeLog:

2017-12-28  Martin Liska  <mliska@suse.cz>

	* predict.c (return_prediction): Do not predict
	PRED_NEGATIVE_RETURN.
	(tree_bb_level_predictions): Do not predict PRED_RECURSIVE_CALL.
	(tree_estimate_probability_bb): Do not predict
	PRED_POLYMORPHIC_CALL and PRED_INDIR_CALL.
	* predict.def (PRED_INDIR_CALL): Remove unused predictors.
	(PRED_POLYMORPHIC_CALL): Likewise.
	(PRED_RECURSIVE_CALL): Likewise.
	(PRED_NEGATIVE_RETURN): Likewise.
---
 gcc/predict.c   | 17 ++---------------
 gcc/predict.def | 13 -------------
 2 files changed, 2 insertions(+), 28 deletions(-)

diff --git a/gcc/predict.c b/gcc/predict.c
index 51fd14205c2..f53724792e9 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -2632,14 +2632,6 @@ return_prediction (tree val, enum prediction *prediction)
     }
   else if (INTEGRAL_TYPE_P (TREE_TYPE (val)))
     {
-      /* Negative return values are often used to indicate
-         errors.  */
-      if (TREE_CODE (val) == INTEGER_CST
-	  && tree_int_cst_sgn (val) < 0)
-	{
-	  *prediction = NOT_TAKEN;
-	  return PRED_NEGATIVE_RETURN;
-	}
       /* Constant return values seems to be commonly taken.
          Zero/one often represent booleans so exclude them from the
 	 heuristics.  */
@@ -2820,9 +2812,6 @@ tree_bb_level_predictions (void)
 				       DECL_ATTRIBUTES (decl)))
 		predict_paths_leading_to (bb, PRED_COLD_FUNCTION,
 					  NOT_TAKEN);
-	      if (decl && recursive_call_p (current_function_decl, decl))
-		predict_paths_leading_to (bb, PRED_RECURSIVE_CALL,
-					  NOT_TAKEN);
 	    }
 	  else if (gimple_code (stmt) == GIMPLE_PREDICT)
 	    {
@@ -2880,12 +2869,10 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
 		     something exceptional.  */
 		  && gimple_has_side_effects (stmt))
 		{
+		  /* Consider just normal function calls, skip indirect and
+		  polymorphic calls as these tend to be unreliable.  */
 		  if (gimple_call_fndecl (stmt))
 		    predict_edge_def (e, PRED_CALL, NOT_TAKEN);
-		  else if (virtual_method_call_p (gimple_call_fn (stmt)))
-		    predict_edge_def (e, PRED_POLYMORPHIC_CALL, NOT_TAKEN);
-		  else
-		    predict_edge_def (e, PRED_INDIR_CALL, TAKEN);
 		  break;
 		}
 	    }
diff --git a/gcc/predict.def b/gcc/predict.def
index fe72080d5bd..7291650d237 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -118,16 +118,6 @@ DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
 /* Branch guarding call is probably taken.  */
 DEF_PREDICTOR (PRED_CALL, "call", HITRATE (67), 0)
 
-/* PRED_CALL is not very reliable predictor and it turns out to be even
-   less reliable for indirect calls and polymorphic calls.  For spec2k6
-   the predictio nis slightly in the direction of taking the call.  */
-DEF_PREDICTOR (PRED_INDIR_CALL, "indirect call", HITRATE (86), 0)
-DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
-
-/* Recursive calls are usually not taken or the function will recurse
-   indefinitely.  */
-DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
-
 /* Branch causing function to terminate is probably not taken.  */
 DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
 	       0)
@@ -138,9 +128,6 @@ DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
 /* Branch ending with return constant is probably not taken.  */
 DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
 
-/* Branch ending with return negative constant is probably not taken.  */
-DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
-
 /* Branch ending with return; is probably not taken */
 DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
 
-- 
2.14.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 5/5] Fix test-suite fallout.
  2018-01-09 11:23 Tweak predictors based on SPEC2006 and SPEC2017 Martin Liška
                   ` (3 preceding siblings ...)
  2018-01-09 11:46 ` [PATCH 4/5] Remove predictors that are unrealiable Martin Liška
@ 2018-01-09 11:48 ` Martin Liška
  2018-01-12  9:13   ` Jan Hubicka
  4 siblings, 1 reply; 15+ messages in thread
From: Martin Liška @ 2018-01-09 11:48 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 62 bytes --]

Last patch from the series it clean up of test-suite.

Martin

[-- Attachment #2: 0005-Fix-test-suite-fallout.patch --]
[-- Type: text/x-patch, Size: 4557 bytes --]

From e76a4544e236d7d586380f33e20431f854436ce4 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Tue, 9 Jan 2018 10:52:36 +0100
Subject: [PATCH 5/5] Fix test-suite fallout.

gcc/testsuite/ChangeLog:

2018-01-09  Martin Liska  <mliska@suse.cz>

	* gcc.dg/predict-1.c: Adjust expected probability.
	* gcc.dg/predict-10.c: Remove.
	* gcc.dg/predict-12.c: Do not scan for removed predictor.
	* gcc.dg/predict-3.c: Adjust expected probability.
	* gcc.dg/predict-5.c: Likewise.
	* gcc.dg/predict-6.c: Likewise.
	* gcc.dg/predict-9.c: Likewise.
---
 gcc/testsuite/gcc.dg/predict-1.c  |  2 +-
 gcc/testsuite/gcc.dg/predict-10.c | 11 -----------
 gcc/testsuite/gcc.dg/predict-12.c |  1 -
 gcc/testsuite/gcc.dg/predict-3.c  |  2 +-
 gcc/testsuite/gcc.dg/predict-5.c  |  2 +-
 gcc/testsuite/gcc.dg/predict-6.c  |  2 +-
 gcc/testsuite/gcc.dg/predict-9.c  |  4 ++--
 7 files changed, 6 insertions(+), 18 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.dg/predict-10.c

diff --git a/gcc/testsuite/gcc.dg/predict-1.c b/gcc/testsuite/gcc.dg/predict-1.c
index 65f6bad9d7c..4ba26e6e256 100644
--- a/gcc/testsuite/gcc.dg/predict-1.c
+++ b/gcc/testsuite/gcc.dg/predict-1.c
@@ -23,4 +23,4 @@ void foo (int bound)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 2.0%" 4 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 36.0%" 4 "profile_estimate"} } */
diff --git a/gcc/testsuite/gcc.dg/predict-10.c b/gcc/testsuite/gcc.dg/predict-10.c
deleted file mode 100644
index a99819a24c6..00000000000
--- a/gcc/testsuite/gcc.dg/predict-10.c
+++ /dev/null
@@ -1,11 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
-int
-ee(int i)
-{
-  if (i>2)
-    return (ee(i-1)+ee(i-2))/2;
-  else
-    return i;
-}
-/* { dg-final { scan-tree-dump-times "recursive call" 1 "profile_estimate"} } */
diff --git a/gcc/testsuite/gcc.dg/predict-12.c b/gcc/testsuite/gcc.dg/predict-12.c
index 1fd4d67c60e..eb5178e7a2e 100644
--- a/gcc/testsuite/gcc.dg/predict-12.c
+++ b/gcc/testsuite/gcc.dg/predict-12.c
@@ -14,4 +14,3 @@ t(void)
 }
 /* { dg-final { scan-tree-dump-times "loop guard with recursion" 1 "profile_estimate"} } */
 /* { dg-final { scan-tree-dump-times "loop exit with recursion" 2 "profile_estimate"} } */
-/* { dg-final { scan-tree-dump-times "recursive call" 1 "profile_estimate"} } */
diff --git a/gcc/testsuite/gcc.dg/predict-3.c b/gcc/testsuite/gcc.dg/predict-3.c
index 7274963b943..81addde1667 100644
--- a/gcc/testsuite/gcc.dg/predict-3.c
+++ b/gcc/testsuite/gcc.dg/predict-3.c
@@ -25,4 +25,4 @@ void foo (int bound)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 98.0%" 3 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 64.0%" 3 "profile_estimate"} } */
diff --git a/gcc/testsuite/gcc.dg/predict-5.c b/gcc/testsuite/gcc.dg/predict-5.c
index 135081de2a4..c80b2928d57 100644
--- a/gcc/testsuite/gcc.dg/predict-5.c
+++ b/gcc/testsuite/gcc.dg/predict-5.c
@@ -21,4 +21,4 @@ void foo (int base, int bound)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 98.0%" 4 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 64.0%" 4 "profile_estimate"} } */
diff --git a/gcc/testsuite/gcc.dg/predict-6.c b/gcc/testsuite/gcc.dg/predict-6.c
index 104683f7f43..3acc7644629 100644
--- a/gcc/testsuite/gcc.dg/predict-6.c
+++ b/gcc/testsuite/gcc.dg/predict-6.c
@@ -21,4 +21,4 @@ void foo (int base, int bound)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 2.0%" 4 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 36.0%" 4 "profile_estimate"} } */
diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
index ec467519504..3823775e3f8 100644
--- a/gcc/testsuite/gcc.dg/predict-9.c
+++ b/gcc/testsuite/gcc.dg/predict-9.c
@@ -19,5 +19,5 @@ void foo (int base)
   }
 }
 
-/* { dg-final { scan-tree-dump-times "first match heuristics: 3.0%" 3 "profile_estimate"} } */
-/* { dg-final { scan-tree-dump-times "first match heuristics: 7.5%" 1 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "first match heuristics: 2.2%" 3 "profile_estimate"} } */
+/* { dg-final { scan-tree-dump-times "first match heuristics: 5.5%" 1 "profile_estimate"} } */
-- 
2.14.3


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] Adjust predictor values according to SPEC2006 and, SPEC2017.
  2018-01-09 11:40 ` [PATCH 3/5] Adjust predictor values according to SPEC2006 and, SPEC2017 Martin Liška
@ 2018-01-09 15:12   ` Jan Hubicka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Hubicka @ 2018-01-09 15:12 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches

> Third part changes predictors values:
> 
> 1) PRED_LOOP_EXIT: no dominant branch; value simply taken from statistics
> 2) PRED_LOOP_EXIT_WITH_RECURSION: there are 4 dominant edges; value taken w/o these edges
> 3) PRED_LOOP_EXTRA_EXIT: there's one really dominant edge; value taken w/o the edge; note that
> coverage of the predictor is quite small
> 4) PRED_OPCODE_POSITIVE: no dominant branch; value simply taken from statistics
> 5) PRED_CONST_RETURN: there are 4 dominant edges, value taken w/o these edges
> 6) PRED_NULL_RETURN: one dominant edge, value taken w/o these edges
> 7) PRED_LOOP_IV_COMPARE_GUESS: very fragile predictor (according how is identified in predict.c);
> has a dominant edge; value taken w/o these edges; in future I plan to investigate it
> 8) PRED_LOOP_GUARD: adjusted based on numbers without a one dominant edge
> 
> Martin

> >From 960f16a6e3e916524d8881f53913c15a3c2ec2ae Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Thu, 28 Dec 2017 10:13:50 +0100
> Subject: [PATCH 3/5] Adjust predictor values according to SPEC2006 and
>  SPEC2017.
> 
> gcc/ChangeLog:
> 
> 2017-12-28  Martin Liska  <mliska@suse.cz>
> 
> 	* predict.def (PRED_LOOP_EXIT): Change from 85 to 89.
> 	(PRED_LOOP_EXIT_WITH_RECURSION): Change from 72 to 78.
> 	(PRED_LOOP_EXTRA_EXIT): Change from 83 to 67.
> 	(PRED_OPCODE_POSITIVE): Change from 64 to 59.
> 	(PRED_TREE_OPCODE_POSITIVE): Change from 64 to 59.
> 	(PRED_CONST_RETURN): Change from 69 to 65.
> 	(PRED_NULL_RETURN): Change from 91 to 71.
> 	(PRED_LOOP_IV_COMPARE_GUESS): Change from 98 to 64.
> 	(PRED_LOOP_GUARD): Change from 66 to 73.

OK,
please wait with commit for tomorrow as i have comitted today the fix to ipa-inline
I would like to hit nightly testers independently.

Thanks!
Honza
> ---
>  gcc/predict.def | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/predict.def b/gcc/predict.def
> index 390b9a35fa7..fe72080d5bd 100644
> --- a/gcc/predict.def
> +++ b/gcc/predict.def
> @@ -89,16 +89,16 @@ DEF_PREDICTOR (PRED_COLD_FUNCTION, "cold function call", PROB_VERY_LIKELY,
>  	       PRED_FLAG_FIRST_MATCH)
>  
>  /* Edge causing loop to terminate is probably not taken.  */
> -DEF_PREDICTOR (PRED_LOOP_EXIT, "loop exit", HITRATE (85),
> +DEF_PREDICTOR (PRED_LOOP_EXIT, "loop exit", HITRATE (89),
>  	       PRED_FLAG_FIRST_MATCH)
>  
>  /* Same as LOOP_EXIT but for loops containing recursive call.  */
>  DEF_PREDICTOR (PRED_LOOP_EXIT_WITH_RECURSION, "loop exit with recursion",
> -	       HITRATE (72), PRED_FLAG_FIRST_MATCH)
> +	       HITRATE (78), PRED_FLAG_FIRST_MATCH)
>  
>  /* Edge causing loop to terminate by computing value used by later
>     conditional.  */
> -DEF_PREDICTOR (PRED_LOOP_EXTRA_EXIT, "extra loop exit", HITRATE (83),
> +DEF_PREDICTOR (PRED_LOOP_EXTRA_EXIT, "extra loop exit", HITRATE (67),
>  	       PRED_FLAG_FIRST_MATCH)
>  
>  /* Pointers are usually not NULL.  */
> @@ -106,11 +106,11 @@ DEF_PREDICTOR (PRED_POINTER, "pointer", HITRATE (70), 0)
>  DEF_PREDICTOR (PRED_TREE_POINTER, "pointer (on trees)", HITRATE (70), 0)
>  
>  /* NE is probable, EQ not etc...  */
> -DEF_PREDICTOR (PRED_OPCODE_POSITIVE, "opcode values positive", HITRATE (64), 0)
> +DEF_PREDICTOR (PRED_OPCODE_POSITIVE, "opcode values positive", HITRATE (59), 0)
>  DEF_PREDICTOR (PRED_OPCODE_NONEQUAL, "opcode values nonequal", HITRATE (66), 0)
>  DEF_PREDICTOR (PRED_FPOPCODE, "fp_opcode", HITRATE (90), 0)
>  DEF_PREDICTOR (PRED_TREE_OPCODE_POSITIVE, "opcode values positive (on trees)",
> -	       HITRATE (64), 0)
> +	       HITRATE (59), 0)
>  DEF_PREDICTOR (PRED_TREE_OPCODE_NONEQUAL, "opcode values nonequal (on trees)",
>  	       HITRATE (66), 0)
>  DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
> @@ -136,18 +136,18 @@ DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
>  DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
>  
>  /* Branch ending with return constant is probably not taken.  */
> -DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (69), 0)
> +DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
>  
>  /* Branch ending with return negative constant is probably not taken.  */
>  DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
>  
>  /* Branch ending with return; is probably not taken */
> -DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (91), 0)
> +DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
>  
>  /* Branches to compare induction variable to a loop bound is
>     extremely likely.  */
>  DEF_PREDICTOR (PRED_LOOP_IV_COMPARE_GUESS, "guess loop iv compare",
> -	       HITRATE (98), 0)
> +	       HITRATE (64), 0)
>  
>  /* Use number of loop iterations determined by # of iterations analysis
>     to set probability of branches that compares IV to loop bound variable.  */
> @@ -160,7 +160,7 @@ DEF_PREDICTOR (PRED_LOOP_IV_COMPARE, "loop iv compare", PROB_UNINITIALIZED,
>         for (loop2)
>  	 body;
>     guess that cond is unlikely.  */
> -DEF_PREDICTOR (PRED_LOOP_GUARD, "loop guard", HITRATE (66), 0)
> +DEF_PREDICTOR (PRED_LOOP_GUARD, "loop guard", HITRATE (73), 0)
>  
>  /* Same but for loops containing recursion.  */
>  DEF_PREDICTOR (PRED_LOOP_GUARD_WITH_RECURSION, "loop guard with recursion",
> -- 
> 2.14.3
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Remove predictors that are unrealiable.
  2018-01-09 11:46 ` [PATCH 4/5] Remove predictors that are unrealiable Martin Liška
@ 2018-01-11 10:41   ` Jan Hubicka
  2018-01-19 13:21     ` Martin Liška
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Hubicka @ 2018-01-11 10:41 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches

> These predictors are in my opinion not reliable and thus I decided to remove them:
> 
> 1) PRED_NEGATIVE_RETURN: probability is ~51%
> 2) PRED_RECURSIVE_CALL: there are 2 dominant edges that influence value to 63%;
> w/o these edges it goes down to 52%
> 3) PRED_POLYMORPHIC_CALL: having very low coverage, probability is ~51%
> 4) PRED_INDIR_CALL: likewise
> 
> Question here is whether we want to remove them, or to predict them with a 'ignored'
> flag? Doing that, we can measure statistics of the predictor in the future?

I believe that recursive call was introudced to help exchange2 benchmark.  I think it does
make sense globally because function simply can not contain hot self recursive call and thus
I would not care about low benchmark coverage.

For polymorphic/indir call I think they are going to grow in importance in future
(especialy first one) and thus I would like to keep them tracked.  If you simply
set probablility to PROB_EVEN, they won't affect branch prediction outcome and
we still will get data on how common they are.

For PRED_NAGATIVE_RETURN, can you take a look why it changed from 98 to 51%?
The idea is that negative values are often used to report error codes and that seems
reasonable.  Perhaps it can be made more specific so it remains working ofr spec2k16?

Honza
> 
> Martin

> >From afbc86cb72eab37bcf6325954d0bf306b301f76e Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Thu, 28 Dec 2017 10:23:48 +0100
> Subject: [PATCH 4/5] Remove predictors that are unrealiable.
> 
> gcc/ChangeLog:
> 
> 2017-12-28  Martin Liska  <mliska@suse.cz>
> 
> 	* predict.c (return_prediction): Do not predict
> 	PRED_NEGATIVE_RETURN.
> 	(tree_bb_level_predictions): Do not predict PRED_RECURSIVE_CALL.
> 	(tree_estimate_probability_bb): Do not predict
> 	PRED_POLYMORPHIC_CALL and PRED_INDIR_CALL.
> 	* predict.def (PRED_INDIR_CALL): Remove unused predictors.
> 	(PRED_POLYMORPHIC_CALL): Likewise.
> 	(PRED_RECURSIVE_CALL): Likewise.
> 	(PRED_NEGATIVE_RETURN): Likewise.
> ---
>  gcc/predict.c   | 17 ++---------------
>  gcc/predict.def | 13 -------------
>  2 files changed, 2 insertions(+), 28 deletions(-)
> 
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 51fd14205c2..f53724792e9 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -2632,14 +2632,6 @@ return_prediction (tree val, enum prediction *prediction)
>      }
>    else if (INTEGRAL_TYPE_P (TREE_TYPE (val)))
>      {
> -      /* Negative return values are often used to indicate
> -         errors.  */
> -      if (TREE_CODE (val) == INTEGER_CST
> -	  && tree_int_cst_sgn (val) < 0)
> -	{
> -	  *prediction = NOT_TAKEN;
> -	  return PRED_NEGATIVE_RETURN;
> -	}
>        /* Constant return values seems to be commonly taken.
>           Zero/one often represent booleans so exclude them from the
>  	 heuristics.  */
> @@ -2820,9 +2812,6 @@ tree_bb_level_predictions (void)
>  				       DECL_ATTRIBUTES (decl)))
>  		predict_paths_leading_to (bb, PRED_COLD_FUNCTION,
>  					  NOT_TAKEN);
> -	      if (decl && recursive_call_p (current_function_decl, decl))
> -		predict_paths_leading_to (bb, PRED_RECURSIVE_CALL,
> -					  NOT_TAKEN);
>  	    }
>  	  else if (gimple_code (stmt) == GIMPLE_PREDICT)
>  	    {
> @@ -2880,12 +2869,10 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>  		     something exceptional.  */
>  		  && gimple_has_side_effects (stmt))
>  		{
> +		  /* Consider just normal function calls, skip indirect and
> +		  polymorphic calls as these tend to be unreliable.  */
>  		  if (gimple_call_fndecl (stmt))
>  		    predict_edge_def (e, PRED_CALL, NOT_TAKEN);
> -		  else if (virtual_method_call_p (gimple_call_fn (stmt)))
> -		    predict_edge_def (e, PRED_POLYMORPHIC_CALL, NOT_TAKEN);
> -		  else
> -		    predict_edge_def (e, PRED_INDIR_CALL, TAKEN);
>  		  break;
>  		}
>  	    }
> diff --git a/gcc/predict.def b/gcc/predict.def
> index fe72080d5bd..7291650d237 100644
> --- a/gcc/predict.def
> +++ b/gcc/predict.def
> @@ -118,16 +118,6 @@ DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
>  /* Branch guarding call is probably taken.  */
>  DEF_PREDICTOR (PRED_CALL, "call", HITRATE (67), 0)
>  
> -/* PRED_CALL is not very reliable predictor and it turns out to be even
> -   less reliable for indirect calls and polymorphic calls.  For spec2k6
> -   the predictio nis slightly in the direction of taking the call.  */
> -DEF_PREDICTOR (PRED_INDIR_CALL, "indirect call", HITRATE (86), 0)
> -DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
> -
> -/* Recursive calls are usually not taken or the function will recurse
> -   indefinitely.  */
> -DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
> -
>  /* Branch causing function to terminate is probably not taken.  */
>  DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
>  	       0)
> @@ -138,9 +128,6 @@ DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
>  /* Branch ending with return constant is probably not taken.  */
>  DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
>  
> -/* Branch ending with return negative constant is probably not taken.  */
> -DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
> -
>  /* Branch ending with return; is probably not taken */
>  DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
>  
> -- 
> 2.14.3
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in, predict.def.
  2018-01-09 11:30 ` [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in, predict.def Martin Liška
@ 2018-01-11 10:44   ` Jan Hubicka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Hubicka @ 2018-01-11 10:44 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches

> Second patch cleans up predictors that do not use a value from predict.def,
> but their value is derived from e.g. number of iterations of a loop.
> These predictors have set probability to PROB_UNINITIALIZED and analyze_branch.py
> does not compare statistics to values defined in predict.def.
> 
> The patch is no-op.

OK, thanks!
Honza
> 
> Martin

> >From 6f6f2d88d3141b1e1604698abf857fffbb42330e Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Wed, 27 Dec 2017 14:49:20 +0100
> Subject: [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in
>  predict.def.
> 
> gcc/ChangeLog:
> 
> 2017-12-28  Martin Liska  <mliska@suse.cz>
> 
> 	* predict.c (predict_insn_def): Add new assert.
> 	(struct branch_predictor): Change type to signed integer.
> 	(test_prediction_value_range): Amend test to cover
> 	PROB_UNINITIALIZED.
> 	* predict.def (PRED_LOOP_ITERATIONS): Use the new constant.
> 	(PRED_LOOP_ITERATIONS_GUESSED): Likewise.
> 	(PRED_LOOP_ITERATIONS_MAX): Likewise.
> 	(PRED_LOOP_IV_COMPARE): Likewise.
> 	* predict.h (PROB_UNINITIALIZED): Define new constant.
> ---
>  gcc/predict.c   | 6 +++++-
>  gcc/predict.def | 8 ++++----
>  gcc/predict.h   | 1 +
>  3 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 3ac18a2c5f2..51fd14205c2 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -545,6 +545,7 @@ predict_insn_def (rtx_insn *insn, enum br_predictor predictor,
>  		  enum prediction taken)
>  {
>     int probability = predictor_info[(int) predictor].hitrate;
> +   gcc_assert (probability != PROB_UNINITIALIZED);
>  
>     if (taken != TAKEN)
>       probability = REG_BR_PROB_BASE - probability;
> @@ -4185,7 +4186,7 @@ namespace selftest {
>  struct branch_predictor
>  {
>    const char *name;
> -  unsigned probability;
> +  int probability;
>  };
>  
>  #define DEF_PREDICTOR(ENUM, NAME, HITRATE, FLAGS) { NAME, HITRATE },
> @@ -4200,6 +4201,9 @@ test_prediction_value_range ()
>  
>    for (unsigned i = 0; predictors[i].name != NULL; i++)
>      {
> +      if (predictors[i].probability == PROB_UNINITIALIZED)
> +	continue;
> +
>        unsigned p = 100 * predictors[i].probability / REG_BR_PROB_BASE;
>        ASSERT_TRUE (p > 50 && p <= 100);
>      }
> diff --git a/gcc/predict.def b/gcc/predict.def
> index 0f37e399312..390b9a35fa7 100644
> --- a/gcc/predict.def
> +++ b/gcc/predict.def
> @@ -54,7 +54,7 @@ DEF_PREDICTOR (PRED_UNCONDITIONAL, "unconditional jump", PROB_ALWAYS,
>  /* Use number of loop iterations determined by # of iterations
>     analysis to set probability.  We don't want to use Dempster-Shaffer
>     theory here, as the predictions is exact.  */
> -DEF_PREDICTOR (PRED_LOOP_ITERATIONS, "loop iterations", PROB_ALWAYS,
> +DEF_PREDICTOR (PRED_LOOP_ITERATIONS, "loop iterations", PROB_UNINITIALIZED,
>  	       PRED_FLAG_FIRST_MATCH)
>  
>  /* Assume that any given atomic operation has low contention,
> @@ -71,11 +71,11 @@ DEF_PREDICTOR (PRED_BUILTIN_EXPECT, "__builtin_expect", PROB_VERY_LIKELY,
>  
>  /* Use number of loop iterations guessed by the contents of the loop.  */
>  DEF_PREDICTOR (PRED_LOOP_ITERATIONS_GUESSED, "guessed loop iterations",
> -	       PROB_ALWAYS, PRED_FLAG_FIRST_MATCH)
> +	       PROB_UNINITIALIZED, PRED_FLAG_FIRST_MATCH)
>  
>  /* Use number of loop iterations guessed by the contents of the loop.  */
>  DEF_PREDICTOR (PRED_LOOP_ITERATIONS_MAX, "guessed loop iterations",
> -	       PROB_ALWAYS, PRED_FLAG_FIRST_MATCH)
> +	       PROB_UNINITIALIZED, PRED_FLAG_FIRST_MATCH)
>  
>  /* Branch containing goto is probably not taken.  */
>  DEF_PREDICTOR (PRED_CONTINUE, "continue", HITRATE (67), 0)
> @@ -151,7 +151,7 @@ DEF_PREDICTOR (PRED_LOOP_IV_COMPARE_GUESS, "guess loop iv compare",
>  
>  /* Use number of loop iterations determined by # of iterations analysis
>     to set probability of branches that compares IV to loop bound variable.  */
> -DEF_PREDICTOR (PRED_LOOP_IV_COMPARE, "loop iv compare", PROB_VERY_LIKELY,
> +DEF_PREDICTOR (PRED_LOOP_IV_COMPARE, "loop iv compare", PROB_UNINITIALIZED,
>  	       PRED_FLAG_FIRST_MATCH)
>  
>  /* In the following code
> diff --git a/gcc/predict.h b/gcc/predict.h
> index 57715159b95..e4d1da090ca 100644
> --- a/gcc/predict.h
> +++ b/gcc/predict.h
> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define PROB_ALWAYS		(REG_BR_PROB_BASE)
>  #define PROB_UNLIKELY           (REG_BR_PROB_BASE / 5 - 1)
>  #define PROB_LIKELY             (REG_BR_PROB_BASE - PROB_UNLIKELY)
> +#define PROB_UNINITIALIZED      (-1)
>  
>  #define DEF_PREDICTOR(ENUM, NAME, HITRATE, FLAGS) ENUM,
>  enum br_predictor
> -- 
> 2.14.3
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] Fix test-suite fallout.
  2018-01-09 11:48 ` [PATCH 5/5] Fix test-suite fallout Martin Liška
@ 2018-01-12  9:13   ` Jan Hubicka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Hubicka @ 2018-01-12  9:13 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches

> Last patch from the series it clean up of test-suite.
> 
> Martin

> >From e76a4544e236d7d586380f33e20431f854436ce4 Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Tue, 9 Jan 2018 10:52:36 +0100
> Subject: [PATCH 5/5] Fix test-suite fallout.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-01-09  Martin Liska  <mliska@suse.cz>
> 
> 	* gcc.dg/predict-1.c: Adjust expected probability.
> 	* gcc.dg/predict-10.c: Remove.
> 	* gcc.dg/predict-12.c: Do not scan for removed predictor.
> 	* gcc.dg/predict-3.c: Adjust expected probability.
> 	* gcc.dg/predict-5.c: Likewise.
> 	* gcc.dg/predict-6.c: Likewise.
> 	* gcc.dg/predict-9.c: Likewise.

OK,
thanks!
Honza
> ---
>  gcc/testsuite/gcc.dg/predict-1.c  |  2 +-
>  gcc/testsuite/gcc.dg/predict-10.c | 11 -----------
>  gcc/testsuite/gcc.dg/predict-12.c |  1 -
>  gcc/testsuite/gcc.dg/predict-3.c  |  2 +-
>  gcc/testsuite/gcc.dg/predict-5.c  |  2 +-
>  gcc/testsuite/gcc.dg/predict-6.c  |  2 +-
>  gcc/testsuite/gcc.dg/predict-9.c  |  4 ++--
>  7 files changed, 6 insertions(+), 18 deletions(-)
>  delete mode 100644 gcc/testsuite/gcc.dg/predict-10.c
> 
> diff --git a/gcc/testsuite/gcc.dg/predict-1.c b/gcc/testsuite/gcc.dg/predict-1.c
> index 65f6bad9d7c..4ba26e6e256 100644
> --- a/gcc/testsuite/gcc.dg/predict-1.c
> +++ b/gcc/testsuite/gcc.dg/predict-1.c
> @@ -23,4 +23,4 @@ void foo (int bound)
>      }
>  }
>  
> -/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 2.0%" 4 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 36.0%" 4 "profile_estimate"} } */
> diff --git a/gcc/testsuite/gcc.dg/predict-10.c b/gcc/testsuite/gcc.dg/predict-10.c
> deleted file mode 100644
> index a99819a24c6..00000000000
> --- a/gcc/testsuite/gcc.dg/predict-10.c
> +++ /dev/null
> @@ -1,11 +0,0 @@
> -/* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> -int
> -ee(int i)
> -{
> -  if (i>2)
> -    return (ee(i-1)+ee(i-2))/2;
> -  else
> -    return i;
> -}
> -/* { dg-final { scan-tree-dump-times "recursive call" 1 "profile_estimate"} } */
> diff --git a/gcc/testsuite/gcc.dg/predict-12.c b/gcc/testsuite/gcc.dg/predict-12.c
> index 1fd4d67c60e..eb5178e7a2e 100644
> --- a/gcc/testsuite/gcc.dg/predict-12.c
> +++ b/gcc/testsuite/gcc.dg/predict-12.c
> @@ -14,4 +14,3 @@ t(void)
>  }
>  /* { dg-final { scan-tree-dump-times "loop guard with recursion" 1 "profile_estimate"} } */
>  /* { dg-final { scan-tree-dump-times "loop exit with recursion" 2 "profile_estimate"} } */
> -/* { dg-final { scan-tree-dump-times "recursive call" 1 "profile_estimate"} } */
> diff --git a/gcc/testsuite/gcc.dg/predict-3.c b/gcc/testsuite/gcc.dg/predict-3.c
> index 7274963b943..81addde1667 100644
> --- a/gcc/testsuite/gcc.dg/predict-3.c
> +++ b/gcc/testsuite/gcc.dg/predict-3.c
> @@ -25,4 +25,4 @@ void foo (int bound)
>      }
>  }
>  
> -/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 98.0%" 3 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 64.0%" 3 "profile_estimate"} } */
> diff --git a/gcc/testsuite/gcc.dg/predict-5.c b/gcc/testsuite/gcc.dg/predict-5.c
> index 135081de2a4..c80b2928d57 100644
> --- a/gcc/testsuite/gcc.dg/predict-5.c
> +++ b/gcc/testsuite/gcc.dg/predict-5.c
> @@ -21,4 +21,4 @@ void foo (int base, int bound)
>      }
>  }
>  
> -/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 98.0%" 4 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 64.0%" 4 "profile_estimate"} } */
> diff --git a/gcc/testsuite/gcc.dg/predict-6.c b/gcc/testsuite/gcc.dg/predict-6.c
> index 104683f7f43..3acc7644629 100644
> --- a/gcc/testsuite/gcc.dg/predict-6.c
> +++ b/gcc/testsuite/gcc.dg/predict-6.c
> @@ -21,4 +21,4 @@ void foo (int base, int bound)
>      }
>  }
>  
> -/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 2.0%" 4 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "guess loop iv compare heuristics of edge\[^:\]*: 36.0%" 4 "profile_estimate"} } */
> diff --git a/gcc/testsuite/gcc.dg/predict-9.c b/gcc/testsuite/gcc.dg/predict-9.c
> index ec467519504..3823775e3f8 100644
> --- a/gcc/testsuite/gcc.dg/predict-9.c
> +++ b/gcc/testsuite/gcc.dg/predict-9.c
> @@ -19,5 +19,5 @@ void foo (int base)
>    }
>  }
>  
> -/* { dg-final { scan-tree-dump-times "first match heuristics: 3.0%" 3 "profile_estimate"} } */
> -/* { dg-final { scan-tree-dump-times "first match heuristics: 7.5%" 1 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "first match heuristics: 2.2%" 3 "profile_estimate"} } */
> +/* { dg-final { scan-tree-dump-times "first match heuristics: 5.5%" 1 "profile_estimate"} } */
> -- 
> 2.14.3
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Remove predictors that are unrealiable.
  2018-01-11 10:41   ` Jan Hubicka
@ 2018-01-19 13:21     ` Martin Liška
  2018-01-22 14:44       ` Martin Liška
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Liška @ 2018-01-19 13:21 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: GCC Patches

On 01/11/2018 11:39 AM, Jan Hubicka wrote:
>> These predictors are in my opinion not reliable and thus I decided to remove them:
>>
>> 1) PRED_NEGATIVE_RETURN: probability is ~51%
>> 2) PRED_RECURSIVE_CALL: there are 2 dominant edges that influence value to 63%;
>> w/o these edges it goes down to 52%
>> 3) PRED_POLYMORPHIC_CALL: having very low coverage, probability is ~51%
>> 4) PRED_INDIR_CALL: likewise
>>
>> Question here is whether we want to remove them, or to predict them with a 'ignored'
>> flag? Doing that, we can measure statistics of the predictor in the future?
> 
> I believe that recursive call was introudced to help exchange2 benchmark.  I think it does
> make sense globally because function simply can not contain hot self recursive call and thus
> I would not care about low benchmark coverage.

Yes it probably was. However according to numbers I have it does not influence the jumpy
benchmarks ;)

> 
> For polymorphic/indir call I think they are going to grow in importance in future
> (especialy first one) and thus I would like to keep them tracked.  If you simply
> set probablility to PROB_EVEN, they won't affect branch prediction outcome and
> we still will get data on how common they are.

Sure, let's make then PROB_EVEN.

> 
> For PRED_NAGATIVE_RETURN, can you take a look why it changed from 98 to 51%?
> The idea is that negative values are often used to report error codes and that seems
> reasonable.  Perhaps it can be made more specific so it remains working ofr spec2k16?

Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
and later one only 11%. It's caused by these 2 benchmarks with a high coverage:

500.perlbench_r: regexec.c.065i.profile:
  negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)

and 523.xalancbmk_r:
build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)

Ideas what to do with the predictor for GCC 8 release?
Martin

> 
> Honza
>>
>> Martin
> 
>> >From afbc86cb72eab37bcf6325954d0bf306b301f76e Mon Sep 17 00:00:00 2001
>> From: marxin <mliska@suse.cz>
>> Date: Thu, 28 Dec 2017 10:23:48 +0100
>> Subject: [PATCH 4/5] Remove predictors that are unrealiable.
>>
>> gcc/ChangeLog:
>>
>> 2017-12-28  Martin Liska  <mliska@suse.cz>
>>
>> 	* predict.c (return_prediction): Do not predict
>> 	PRED_NEGATIVE_RETURN.
>> 	(tree_bb_level_predictions): Do not predict PRED_RECURSIVE_CALL.
>> 	(tree_estimate_probability_bb): Do not predict
>> 	PRED_POLYMORPHIC_CALL and PRED_INDIR_CALL.
>> 	* predict.def (PRED_INDIR_CALL): Remove unused predictors.
>> 	(PRED_POLYMORPHIC_CALL): Likewise.
>> 	(PRED_RECURSIVE_CALL): Likewise.
>> 	(PRED_NEGATIVE_RETURN): Likewise.
>> ---
>>  gcc/predict.c   | 17 ++---------------
>>  gcc/predict.def | 13 -------------
>>  2 files changed, 2 insertions(+), 28 deletions(-)
>>
>> diff --git a/gcc/predict.c b/gcc/predict.c
>> index 51fd14205c2..f53724792e9 100644
>> --- a/gcc/predict.c
>> +++ b/gcc/predict.c
>> @@ -2632,14 +2632,6 @@ return_prediction (tree val, enum prediction *prediction)
>>      }
>>    else if (INTEGRAL_TYPE_P (TREE_TYPE (val)))
>>      {
>> -      /* Negative return values are often used to indicate
>> -         errors.  */
>> -      if (TREE_CODE (val) == INTEGER_CST
>> -	  && tree_int_cst_sgn (val) < 0)
>> -	{
>> -	  *prediction = NOT_TAKEN;
>> -	  return PRED_NEGATIVE_RETURN;
>> -	}
>>        /* Constant return values seems to be commonly taken.
>>           Zero/one often represent booleans so exclude them from the
>>  	 heuristics.  */
>> @@ -2820,9 +2812,6 @@ tree_bb_level_predictions (void)
>>  				       DECL_ATTRIBUTES (decl)))
>>  		predict_paths_leading_to (bb, PRED_COLD_FUNCTION,
>>  					  NOT_TAKEN);
>> -	      if (decl && recursive_call_p (current_function_decl, decl))
>> -		predict_paths_leading_to (bb, PRED_RECURSIVE_CALL,
>> -					  NOT_TAKEN);
>>  	    }
>>  	  else if (gimple_code (stmt) == GIMPLE_PREDICT)
>>  	    {
>> @@ -2880,12 +2869,10 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>>  		     something exceptional.  */
>>  		  && gimple_has_side_effects (stmt))
>>  		{
>> +		  /* Consider just normal function calls, skip indirect and
>> +		  polymorphic calls as these tend to be unreliable.  */
>>  		  if (gimple_call_fndecl (stmt))
>>  		    predict_edge_def (e, PRED_CALL, NOT_TAKEN);
>> -		  else if (virtual_method_call_p (gimple_call_fn (stmt)))
>> -		    predict_edge_def (e, PRED_POLYMORPHIC_CALL, NOT_TAKEN);
>> -		  else
>> -		    predict_edge_def (e, PRED_INDIR_CALL, TAKEN);
>>  		  break;
>>  		}
>>  	    }
>> diff --git a/gcc/predict.def b/gcc/predict.def
>> index fe72080d5bd..7291650d237 100644
>> --- a/gcc/predict.def
>> +++ b/gcc/predict.def
>> @@ -118,16 +118,6 @@ DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
>>  /* Branch guarding call is probably taken.  */
>>  DEF_PREDICTOR (PRED_CALL, "call", HITRATE (67), 0)
>>  
>> -/* PRED_CALL is not very reliable predictor and it turns out to be even
>> -   less reliable for indirect calls and polymorphic calls.  For spec2k6
>> -   the predictio nis slightly in the direction of taking the call.  */
>> -DEF_PREDICTOR (PRED_INDIR_CALL, "indirect call", HITRATE (86), 0)
>> -DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
>> -
>> -/* Recursive calls are usually not taken or the function will recurse
>> -   indefinitely.  */
>> -DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
>> -
>>  /* Branch causing function to terminate is probably not taken.  */
>>  DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
>>  	       0)
>> @@ -138,9 +128,6 @@ DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
>>  /* Branch ending with return constant is probably not taken.  */
>>  DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
>>  
>> -/* Branch ending with return negative constant is probably not taken.  */
>> -DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
>> -
>>  /* Branch ending with return; is probably not taken */
>>  DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
>>  
>> -- 
>> 2.14.3
>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Remove predictors that are unrealiable.
  2018-01-19 13:21     ` Martin Liška
@ 2018-01-22 14:44       ` Martin Liška
  2018-01-23 10:04         ` Jan Hubicka
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Liška @ 2018-01-22 14:44 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: GCC Patches

On 01/19/2018 12:57 PM, Martin Liška wrote:
> Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
> and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
> 

Hi.

I'm sending details about the 2 edges that influence the statistics significantly:

> 500.perlbench_r: regexec.c.065i.profile:
>   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)

2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
        -: 3485:{
2477714850: 3486:    CHECKPOINT lastcp;
2477714850: 3487:    REGEXP *const rx = reginfo->prog;
2477714850: 3488:    regexp *const prog = ReANY(rx);
2477714850: 3489:    SSize_t result;
[snip]
2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
        -:  8048:}

As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.

> 
> and 523.xalancbmk_r:
> build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)

1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
        -:   75:                                   , const XMLCh* const rValue
        -:   76:                                   ,       MemoryManager*     const)
        -:   77:{
1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
        -:   79:}
        -:   80:

IP profile dump file:

xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
{
  bool _1;
  int iftmp.0_2;

  <bb 2> [count: 1221735072]:
  # DEBUG BEGIN_STMT
  _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
  if (_1 != 0)
    goto <bb 4>; [0.02%]
  else
    goto <bb 3>; [99.98%]

  <bb 3> [count: 1221522453]:

  <bb 4> [count: 1221735072]:
  # iftmp.0_2 = PHI <0(2), -1(3)>
  return iftmp.0_2;

}

Likewise, XML values are more commonly non-equal.
Ok, so may I mark also negative return with PROB_EVEN to track it?

Thanks,
Martin

> 
> Ideas what to do with the predictor for GCC 8 release?
> Martin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Remove predictors that are unrealiable.
  2018-01-22 14:44       ` Martin Liška
@ 2018-01-23 10:04         ` Jan Hubicka
  2018-01-23 12:20           ` Martin Liška
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Hubicka @ 2018-01-23 10:04 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches

> On 01/19/2018 12:57 PM, Martin Liška wrote:
> > Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
> > and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
> > 
> 
> Hi.
> 
> I'm sending details about the 2 edges that influence the statistics significantly:
> 
> > 500.perlbench_r: regexec.c.065i.profile:
> >   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)
> 
> 2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
>         -: 3485:{
> 2477714850: 3486:    CHECKPOINT lastcp;
> 2477714850: 3487:    REGEXP *const rx = reginfo->prog;
> 2477714850: 3488:    regexp *const prog = ReANY(rx);
> 2477714850: 3489:    SSize_t result;
> [snip]
> 2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
> 2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
>         -:  8048:}
> 
> As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.
> 
> > 
> > and 523.xalancbmk_r:
> > build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)
> 
> 1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
>         -:   75:                                   , const XMLCh* const rValue
>         -:   76:                                   ,       MemoryManager*     const)
>         -:   77:{
> 1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
>         -:   79:}
>         -:   80:
> 
> IP profile dump file:
> 
> xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
> {
>   bool _1;
>   int iftmp.0_2;
> 
>   <bb 2> [count: 1221735072]:
>   # DEBUG BEGIN_STMT
>   _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
>   if (_1 != 0)
>     goto <bb 4>; [0.02%]
>   else
>     goto <bb 3>; [99.98%]
> 
>   <bb 3> [count: 1221522453]:
> 
>   <bb 4> [count: 1221735072]:
>   # iftmp.0_2 = PHI <0(2), -1(3)>
>   return iftmp.0_2;
> 
> }
> 
> Likewise, XML values are more commonly non-equal.
> Ok, so may I mark also negative return with PROB_EVEN to track it?

Well, if we start disabling predictors just because we can find benchmark where they do not
perform as expected, we will need to disable them all. It is nature of heuristics to make
mistakes, just they should do something useful for common code.
It is not goal to make heuristics that makes no mistake.
Heuristics is useful either if it have high precision even if it cover few
branches (say noreturn), or if it have large coverage and still some hitrate (say opcode,
call or return based heuristics).

To make things bit more esoteric, in practice this is also bit weighted by fact
how much damage bad prediction does.  For example, call heuristics which predict
non-call path to be taken is likely going to do little damage. Even if call
path is common it is heavy and hard to optimize by presence of call and thus we
don't loose that much in common scenarios.

In the second case:
00073 // -----------------------------------------------------------------------
00074 // Compare methods
00075 // -----------------------------------------------------------------------
00076 int NameDatatypeValidator::compare(const XMLCh* const lValue
00077                                    , const XMLCh* const rValue
00078                                    ,       MemoryManager*     const)
00079 {
00080     return ( XMLString::equals(lValue, rValue)? 0 : -1);
00081 }

I would say it is odd coding style.  I do not see why it is not declared
bool and not returning true/false. 

First case seems bit non-standard too as it looks like usual cache lookup just
the cache frequently fails.

I would be in favor of keeping the prediction with non-50% hitrate thus unless
we run into some performance problems.
If there are only two benchmarks out of say 20 we tried (in spec2000 and 2k17)
it seems fine to me.

There is issue with the way we collect our data.  Using arithmetic mean across
spec is optimizing for overall runtime of all spec benchmarks combined
together.  We more want to look for safer values that do well for average
benchmarks independently how many branches they do.  We could consider making
the statistics script also collect geometric averages across different
benchmarks. If we will see big difference between arithmetic mean and geomean
we will know there is small subset of benchmarks which dominates and we could
decide what to do with the probability.

Honza

> 
> Thanks,
> Martin
> 
> > 
> > Ideas what to do with the predictor for GCC 8 release?
> > Martin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Remove predictors that are unrealiable.
  2018-01-23 10:04         ` Jan Hubicka
@ 2018-01-23 12:20           ` Martin Liška
  2018-01-23 14:06             ` Jan Hubicka
  0 siblings, 1 reply; 15+ messages in thread
From: Martin Liška @ 2018-01-23 12:20 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: GCC Patches

On 01/23/2018 11:02 AM, Jan Hubicka wrote:
>> On 01/19/2018 12:57 PM, Martin Liška wrote:
>>> Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
>>> and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
>>>
>>
>> Hi.
>>
>> I'm sending details about the 2 edges that influence the statistics significantly:
>>
>>> 500.perlbench_r: regexec.c.065i.profile:
>>>   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)
>>
>> 2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
>>         -: 3485:{
>> 2477714850: 3486:    CHECKPOINT lastcp;
>> 2477714850: 3487:    REGEXP *const rx = reginfo->prog;
>> 2477714850: 3488:    regexp *const prog = ReANY(rx);
>> 2477714850: 3489:    SSize_t result;
>> [snip]
>> 2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
>> 2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
>>         -:  8048:}
>>
>> As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.
>>
>>>
>>> and 523.xalancbmk_r:
>>> build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)
>>
>> 1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
>>         -:   75:                                   , const XMLCh* const rValue
>>         -:   76:                                   ,       MemoryManager*     const)
>>         -:   77:{
>> 1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
>>         -:   79:}
>>         -:   80:
>>
>> IP profile dump file:
>>
>> xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
>> {
>>   bool _1;
>>   int iftmp.0_2;
>>
>>   <bb 2> [count: 1221735072]:
>>   # DEBUG BEGIN_STMT
>>   _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
>>   if (_1 != 0)
>>     goto <bb 4>; [0.02%]
>>   else
>>     goto <bb 3>; [99.98%]
>>
>>   <bb 3> [count: 1221522453]:
>>
>>   <bb 4> [count: 1221735072]:
>>   # iftmp.0_2 = PHI <0(2), -1(3)>
>>   return iftmp.0_2;
>>
>> }
>>
>> Likewise, XML values are more commonly non-equal.
>> Ok, so may I mark also negative return with PROB_EVEN to track it?
> 
> Well, if we start disabling predictors just because we can find benchmark where they do not
> perform as expected, we will need to disable them all. It is nature of heuristics to make
> mistakes, just they should do something useful for common code.
> It is not goal to make heuristics that makes no mistake.
> Heuristics is useful either if it have high precision even if it cover few
> branches (say noreturn), or if it have large coverage and still some hitrate (say opcode,
> call or return based heuristics).
> 
> To make things bit more esoteric, in practice this is also bit weighted by fact
> how much damage bad prediction does.  For example, call heuristics which predict
> non-call path to be taken is likely going to do little damage. Even if call
> path is common it is heavy and hard to optimize by presence of call and thus we
> don't loose that much in common scenarios.
> 
> In the second case:
> 00073 // -----------------------------------------------------------------------
> 00074 // Compare methods
> 00075 // -----------------------------------------------------------------------
> 00076 int NameDatatypeValidator::compare(const XMLCh* const lValue
> 00077                                    , const XMLCh* const rValue
> 00078                                    ,       MemoryManager*     const)
> 00079 {
> 00080     return ( XMLString::equals(lValue, rValue)? 0 : -1);
> 00081 }
> 
> I would say it is odd coding style.  I do not see why it is not declared
> bool and not returning true/false. 
> 
> First case seems bit non-standard too as it looks like usual cache lookup just
> the cache frequently fails.
> 
> I would be in favor of keeping the prediction with non-50% hitrate thus unless
> we run into some performance problems.
> If there are only two benchmarks out of say 20 we tried (in spec2000 and 2k17)
> it seems fine to me.
> 
> There is issue with the way we collect our data.  Using arithmetic mean across
> spec is optimizing for overall runtime of all spec benchmarks combined
> together.  We more want to look for safer values that do well for average
> benchmarks independently how many branches they do.  We could consider making
> the statistics script also collect geometric averages across different
> benchmarks. If we will see big difference between arithmetic mean and geomean
> we will know there is small subset of benchmarks which dominates and we could
> decide what to do with the probability.

Hello.

I fully agree that proper way to do statistics is to do a weight average across
benchmarks how each predictor hits performance. Which is nice, but quite hard
to achieve. Let's talk personally how we can change it in next stage1.

For now, in case of the 'negative return' predictor, I tend to leave it as it is
and install the rest of the patch that makes PROB_EVEN probability.

Works for you Honza?
Martin

> 
> Honza
> 
>>
>> Thanks,
>> Martin
>>
>>>
>>> Ideas what to do with the predictor for GCC 8 release?
>>> Martin

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/5] Remove predictors that are unrealiable.
  2018-01-23 12:20           ` Martin Liška
@ 2018-01-23 14:06             ` Jan Hubicka
  0 siblings, 0 replies; 15+ messages in thread
From: Jan Hubicka @ 2018-01-23 14:06 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches

> On 01/23/2018 11:02 AM, Jan Hubicka wrote:
> >> On 01/19/2018 12:57 PM, Martin Liška wrote:
> >>> Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
> >>> and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
> >>>
> >>
> >> Hi.
> >>
> >> I'm sending details about the 2 edges that influence the statistics significantly:
> >>
> >>> 500.perlbench_r: regexec.c.065i.profile:
> >>>   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)
> >>
> >> 2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
> >>         -: 3485:{
> >> 2477714850: 3486:    CHECKPOINT lastcp;
> >> 2477714850: 3487:    REGEXP *const rx = reginfo->prog;
> >> 2477714850: 3488:    regexp *const prog = ReANY(rx);
> >> 2477714850: 3489:    SSize_t result;
> >> [snip]
> >> 2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
> >> 2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
> >>         -:  8048:}
> >>
> >> As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.
> >>
> >>>
> >>> and 523.xalancbmk_r:
> >>> build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)
> >>
> >> 1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
> >>         -:   75:                                   , const XMLCh* const rValue
> >>         -:   76:                                   ,       MemoryManager*     const)
> >>         -:   77:{
> >> 1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
> >>         -:   79:}
> >>         -:   80:
> >>
> >> IP profile dump file:
> >>
> >> xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
> >> {
> >>   bool _1;
> >>   int iftmp.0_2;
> >>
> >>   <bb 2> [count: 1221735072]:
> >>   # DEBUG BEGIN_STMT
> >>   _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
> >>   if (_1 != 0)
> >>     goto <bb 4>; [0.02%]
> >>   else
> >>     goto <bb 3>; [99.98%]
> >>
> >>   <bb 3> [count: 1221522453]:
> >>
> >>   <bb 4> [count: 1221735072]:
> >>   # iftmp.0_2 = PHI <0(2), -1(3)>
> >>   return iftmp.0_2;
> >>
> >> }
> >>
> >> Likewise, XML values are more commonly non-equal.
> >> Ok, so may I mark also negative return with PROB_EVEN to track it?
> > 
> > Well, if we start disabling predictors just because we can find benchmark where they do not
> > perform as expected, we will need to disable them all. It is nature of heuristics to make
> > mistakes, just they should do something useful for common code.
> > It is not goal to make heuristics that makes no mistake.
> > Heuristics is useful either if it have high precision even if it cover few
> > branches (say noreturn), or if it have large coverage and still some hitrate (say opcode,
> > call or return based heuristics).
> > 
> > To make things bit more esoteric, in practice this is also bit weighted by fact
> > how much damage bad prediction does.  For example, call heuristics which predict
> > non-call path to be taken is likely going to do little damage. Even if call
> > path is common it is heavy and hard to optimize by presence of call and thus we
> > don't loose that much in common scenarios.
> > 
> > In the second case:
> > 00073 // -----------------------------------------------------------------------
> > 00074 // Compare methods
> > 00075 // -----------------------------------------------------------------------
> > 00076 int NameDatatypeValidator::compare(const XMLCh* const lValue
> > 00077                                    , const XMLCh* const rValue
> > 00078                                    ,       MemoryManager*     const)
> > 00079 {
> > 00080     return ( XMLString::equals(lValue, rValue)? 0 : -1);
> > 00081 }
> > 
> > I would say it is odd coding style.  I do not see why it is not declared
> > bool and not returning true/false. 
> > 
> > First case seems bit non-standard too as it looks like usual cache lookup just
> > the cache frequently fails.
> > 
> > I would be in favor of keeping the prediction with non-50% hitrate thus unless
> > we run into some performance problems.
> > If there are only two benchmarks out of say 20 we tried (in spec2000 and 2k17)
> > it seems fine to me.
> > 
> > There is issue with the way we collect our data.  Using arithmetic mean across
> > spec is optimizing for overall runtime of all spec benchmarks combined
> > together.  We more want to look for safer values that do well for average
> > benchmarks independently how many branches they do.  We could consider making
> > the statistics script also collect geometric averages across different
> > benchmarks. If we will see big difference between arithmetic mean and geomean
> > we will know there is small subset of benchmarks which dominates and we could
> > decide what to do with the probability.
> 
> Hello.
> 
> I fully agree that proper way to do statistics is to do a weight average across
> benchmarks how each predictor hits performance. Which is nice, but quite hard
> to achieve. Let's talk personally how we can change it in next stage1.
> 
> For now, in case of the 'negative return' predictor, I tend to leave it as it is
> and install the rest of the patch that makes PROB_EVEN probability.

OK, makes sense to me.
Honza
> 
> Works for you Honza?
> Martin
> 
> > 
> > Honza
> > 
> >>
> >> Thanks,
> >> Martin
> >>
> >>>
> >>> Ideas what to do with the predictor for GCC 8 release?
> >>> Martin

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-01-23 13:49 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-09 11:23 Tweak predictors based on SPEC2006 and SPEC2017 Martin Liška
2018-01-09 11:26 ` [PATCH 1/5] Fix usage of analyze_brprob.py script Martin Liška
2018-01-09 11:30 ` [PATCH 2/5] Introduce PROB_UNINITIALIZED constant and use it in, predict.def Martin Liška
2018-01-11 10:44   ` Jan Hubicka
2018-01-09 11:40 ` [PATCH 3/5] Adjust predictor values according to SPEC2006 and, SPEC2017 Martin Liška
2018-01-09 15:12   ` Jan Hubicka
2018-01-09 11:46 ` [PATCH 4/5] Remove predictors that are unrealiable Martin Liška
2018-01-11 10:41   ` Jan Hubicka
2018-01-19 13:21     ` Martin Liška
2018-01-22 14:44       ` Martin Liška
2018-01-23 10:04         ` Jan Hubicka
2018-01-23 12:20           ` Martin Liška
2018-01-23 14:06             ` Jan Hubicka
2018-01-09 11:48 ` [PATCH 5/5] Fix test-suite fallout Martin Liška
2018-01-12  9:13   ` Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).