public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/2] Align tight loops to solve cross cacheline issue
@ 2024-05-15  3:04 Haochen Jiang
From: Haochen Jiang @ 2024-05-15  3:04 UTC
  To: gcc-patches; +Cc: hongtao.liu, ubizjak

Hi all,

Recently, we have seen several seemingly random performance regressions
in benchmarks from one commit to the next. They are caused by tight
loops that end up crossing a cache line boundary.
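
To make the problem concrete, here is a minimal sketch of the address
arithmetic. It is only a model, not code from the patches: the 64-byte
cache line, the 24-byte loop body and the start addresses are assumed
values chosen for illustration.

```python
CACHE_LINE = 64  # bytes; assumed cache line size


def crosses_cache_line(start, size, line=CACHE_LINE):
    """True if the byte range [start, start + size) touches more than one line."""
    return start // line != (start + size - 1) // line


def align_up(addr, align):
    """Round addr up to the next multiple of align."""
    return (addr + align - 1) // align * align


loop_size = 24  # hypothetical tight loop body, in bytes

# Unaligned start: bytes 58..81 straddle the boundary at 64.
print(crosses_cache_line(58, loop_size))

# After 16-byte alignment the same loop starts at 64 and fits in one line.
print(crosses_cache_line(align_up(58, 16), loop_size))

# 16-byte alignment is not always enough: 42 rounds up to 48, and
# bytes 48..71 still straddle the boundary at 64.
print(crosses_cache_line(align_up(42, 16), loop_size))
```

A small code change earlier in the function can shift the loop start by a
few bytes and flip the first result, which is consistent with the
regressions appearing and disappearing from commit to commit.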

We are trying to solve the issue with two patches. The first adjusts the
loop alignment for generic tuning; the second aligns tight and hot
loops more aggressively.
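
For readers unfamiliar with the "16:11:8" notation, the sketch below
models how such a value turns into padding. It assumes the string follows
the documented n:m:n2:m2 syntax of -falign-loops (align to n if at most
m-1 bytes are skipped, otherwise fall back to n2 with at most m2-1 bytes
skipped; m defaults to n and m2 to n2) and that n and n2 are powers of
two. The function names are made up for illustration and do not appear
in GCC.

```python
def parse_align(spec):
    """Parse 'n[:m[:n2[:m2]]]' into a list of (alignment, max_skip) pairs."""
    parts = [int(p) for p in spec.split(":")]
    n = parts[0]
    m = parts[1] if len(parts) > 1 else n  # m defaults to n
    pairs = [(n, m - 1)]
    if len(parts) > 2:
        n2 = parts[2]
        m2 = parts[3] if len(parts) > 3 else n2  # m2 defaults to n2
        pairs.append((n2, m2 - 1))
    return pairs


def padding(addr, spec):
    """Bytes of padding inserted before a loop that would start at addr."""
    for align, max_skip in parse_align(spec):
        skip = -addr % align  # distance to the next multiple of align
        if skip <= max_skip:
            return skip
    return 0  # no alignment could be reached within its skip limit


# A loop that would start at address 53:
print(padding(53, "16:11:8"))  # 3: reaching 64 needs 11 > 10 bytes, so fall back to 8
print(padding(53, "16"))       # 11: always pad up to the 16-byte boundary
```

Under this model, "16" never leaves a loop at a merely 8-byte-aligned
address, at the cost of up to 15 bytes of padding per loop.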

For SPECINT, we get a 0.85% overall improvement in rates with options
-O2 -march=x86-64-v3 -mtune=generic on Emerald Rapids (EMR).

Benchmarks      EMR Rates
500.perlbench_r -1.21%
502.gcc_r       0.78%
505.mcf_r       0.00%
520.omnetpp_r   0.41%
523.xalancbmk_r 1.33%
525.x264_r      2.83%
531.deepsjeng_r 1.11%
541.leela_r     0.00%
548.exchange2_r 2.36%
557.xz_r        0.98%
Geomean-int     0.85%

The side effect is a 1.40% increase in code size.

Benchmarks      EMR Code size
500.perlbench_r 0.70%
502.gcc_r       0.67%
505.mcf_r       3.26%
520.omnetpp_r   0.31%
523.xalancbmk_r 1.15%
525.x264_r      1.11%
531.deepsjeng_r 1.40%
541.leela_r     1.31%
548.exchange2_r 3.06%
557.xz_r        1.04%
Geomean-int     1.40%
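
As a cross-check of the two Geomean-int rows, the sketch below recomputes
them from the per-benchmark numbers in the tables above. It assumes each
percentage is a relative change, so that the geometric mean is taken over
the ratios 1 + change/100; that is an interpretation, not something
stated in this mail.

```python
import math

# Per-benchmark changes in percent, copied from the two tables above.
rate_change_pct = [-1.21, 0.78, 0.00, 0.41, 1.33, 2.83, 1.11, 0.00, 2.36, 0.98]
codesize_change_pct = [0.70, 0.67, 3.26, 0.31, 1.15, 1.11, 1.40, 1.31, 3.06, 1.04]


def geomean_change_pct(changes_pct):
    """Geometric mean of the ratios (1 + change/100), returned as a percent change."""
    ratios = [1 + c / 100 for c in changes_pct]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return (math.exp(log_mean) - 1) * 100


print(f"rates:     {geomean_change_pct(rate_change_pct):.2f}%")
print(f"code size: {geomean_change_pct(codesize_change_pct):.2f}%")
```

Both results agree with the reported 0.85% and 1.40% to within rounding.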

Bootstrapped and regtested on x86_64-pc-linux-gnu.

If nothing unexpected happens within a month after this is committed to
trunk, we plan to backport it to GCC 14.2.

Thx,
Haochen

Haochen Jiang (1):
  Adjust generic loop alignment from 16:11:8 to 16 for Intel processors

liuhongt (1):
  Align tight&hot loop without considering max skipping bytes.

 gcc/config/i386/i386.cc          | 148 ++++++++++++++++++++++++++++++-
 gcc/config/i386/i386.md          |  10 ++-
 gcc/config/i386/x86-tune-costs.h |   2 +-
 3 files changed, 154 insertions(+), 6 deletions(-)

-- 
2.31.1



Thread overview: 7+ messages
2024-05-15  3:04 [PATCH 0/2] Align tight loops to solve cross cacheline issue Haochen Jiang
2024-05-15  3:04 ` [PATCH 1/2] Adjust generic loop alignment from 16:11:8 to 16 for Intel processors Haochen Jiang
2024-05-15  3:04 ` [PATCH 2/2] Align tight&hot loop without considering max skipping bytes Haochen Jiang
2024-05-15  3:30 ` [PATCH 0/2] Align tight loops to solve cross cacheline issue Jiang, Haochen
2024-05-20  3:15   ` Hongtao Liu
2024-05-27  1:33     ` Hongtao Liu
2024-05-29  3:30       ` Jiang, Haochen
