On 4/24/24 12:22, Robin Dapp wrote:
>> The dynamic icounts looks sane (vs. Apr 10 snapshot) except for a
>> regression in x264 which is likely independent of the chaos going on.
>>
>>      Apr 10     |     Apr 23      |
>>   109f1b28fc94  |  6f0a646dd2fc   |
>> ----------------+-----------------+--------
>> 276,584,692,883 | 277,816,987,018 |  -0.45%
>> 913,452,236,000 | 927,291,935,180 |  -1.52%
>> 903,916,092,805 | 915,364,006,176 |  -1.27%
> x264 uses widening arithmetic so it could be the reverts.
> Can you compare the hot functions (e.g. x264_pixel_sad_16x16)

Function                                     old     new   delta
x264_pixel_sad_x4_16x8.lto_priv             5188    5288    +100
x264_pixel_sad_x4_8x16.lto_priv             5844    5924     +80
x264_pixel_sad_x3_16x8.lto_priv             3904    3980     +76
x264_pixel_sad_x4_16x16.lto_priv             834     898     +64
x264_pixel_sad_x3_8x16.lto_priv             4408    4468     +60
x264_pixel_sad_x4_8x8.lto_priv              3010    3058     +48
x264_pixel_sad_x3_8x8.lto_priv              2290    2338     +48
...
...
x264_pixel_sad_x4_4x8.lto_priv              1366    1362      -4
x264_pixel_sad_x4_4x4.lto_priv               716     712      -4
x264_pixel_sad_4x8.lto_priv                  332     328      -4
x264_pixel_sad_4x4.lto_priv                  172     168      -4
hpel_filter.lto_priv                         984     980      -4

> if anything stands out surrounding the vwadd.wv for example?

Yeah it does: not specifically in the routine you mentioned above but in
its various brethren: see attached objdump for x264_pixel_sad_x4_16x16 ()
for the 2 cases.

-Vineet