From: Alexander Monakov <amonakov@ispras.ru>
To: "Joshi, Tejas Sanjay" <TejasSanjay.Joshi@amd.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
"honza.hubicka@gmail.com" <honza.hubicka@gmail.com>,
"Kumar, Venkataramanan" <Venkataramanan.Kumar@amd.com>
Subject: RE: [PATCH][X86_64] Separate znver4 insn reservations from older znvers
Date: Mon, 21 Nov 2022 18:30:01 +0300 (MSK) [thread overview]
Message-ID: <27b06e9f-1e80-585d-624e-6f50475a5aa8@ispras.ru> (raw)
In-Reply-To: <DM6PR12MB47955528574932E8B55DD545E30A9@DM6PR12MB4795.namprd12.prod.outlook.com>
On Mon, 21 Nov 2022, Joshi, Tejas Sanjay wrote:
> I have addressed all your comments in the patch attached here. I have also
> used znver4-direct for avx512 insns.
Thanks.
> * This patch increased the insn-automata.cc size from 201502 to 214902.
Assuming it's the number of lines of code, I have 102847, perhaps you're
measuring without my patches? You can use 'size -A gcc/insn-automata.o'
to measure binary size growth.
> * Compile time and binary size on my machine remains same.
> * Make check and bootstrap build have no issues.
> * Spec cpu2017 also don't have any issues with this patch.
>
> Is this ok for trunk?
I cannot approve or reject your patch, this is up to Honza who I believe
was investigating if combining this with older Zen models makes sense.
In the meantime, I see a few more issues that can be easily corrected,
please see below.
> --- /dev/null
> +++ b/gcc/config/i386/znver4.md
> +;; FSQRT
> +(define_insn_reservation "znver4_fsqrt" 22
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "fpspc")
> + (and (eq_attr "mode" "XF")
> + (eq_attr "memory" "none"))))
> + "znver4-direct,znver4-fdiv*20")
This should be znver4-fdiv*10 (not *20) according to Agner Fog's measurements.
> +;; FDIV
> +(define_insn_reservation "znver4_fp_div" 15
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "fdiv")
> + (eq_attr "memory" "none")))
> + "znver4-direct,znver4-fdiv*15")
znver4-fdiv*6 instead of *15 here and in two patterns following this one.
> +(define_insn_reservation "znver4_sse_div_pd" 13
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V4DF,V2DF,V1DF")
> + (eq_attr "memory" "none"))))
> + "znver4-direct,znver4-fdiv*7")
Agner Fog's measurements indicate fdiv*5 here.
> +
> +(define_insn_reservation "znver4_sse_div_ps" 10
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V8SF,V4SF,V2SF,SF")
> + (eq_attr "memory" "none"))))
> + "znver4-direct,znver4-fdiv*5")
Agner Fog's measurements indicate fdiv*3 here.
> +
> +(define_insn_reservation "znver4_sse_div_pd_load" 20
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V4DF,V2DF,V1DF")
> + (eq_attr "memory" "load"))))
> + "znver4-direct,znver4-load,znver4-fdiv*7")
fdiv*5?
> +
> +(define_insn_reservation "znver4_sse_div_ps_load" 17
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V8SF,V4SF,V2SF,SF")
> + (eq_attr "memory" "load"))))
> + "znver4-direct,znver4-load,znver4-fdiv*5")
fdiv*3?
> +(define_insn_reservation "znver4_sse_div_pd_evex" 13
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V8DF")
> + (eq_attr "memory" "none"))))
> + "znver4-direct,znver4-fdiv*7")
This should be twice as much as the corresponding SSE/AVX instruction
(fdiv*14 or fdiv*10; Agner Fog measured 9 cycles as reciprocal throughput).
> +
> +(define_insn_reservation "znver4_sse_div_ps_evex" 10
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V16SF")
> + (eq_attr "memory" "none"))))
> + "znver4-direct,znver4-fdiv*5")
Likewise (fdiv*6).
> +(define_insn_reservation "znver4_sse_div_pd_evex_load" 20
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V8DF")
> + (eq_attr "memory" "load"))))
> + "znver4-direct,znver4-load,znver4-fdiv*7")
Likewise.
> +(define_insn_reservation "znver4_sse_div_ps_evex_load" 17
> + (and (eq_attr "cpu" "znver4")
> + (and (eq_attr "type" "ssediv")
> + (and (eq_attr "mode" "V16SF")
> + (eq_attr "memory" "load"))))
> + "znver4-direct,znver4-load,znver4-fdiv*5")
Likewise.
Alexander
next prev parent reply other threads:[~2022-11-21 15:30 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-14 16:18 Joshi, Tejas Sanjay
2022-11-14 18:51 ` Alexander Monakov
2022-11-15 12:08 ` Joshi, Tejas Sanjay
2022-11-15 12:51 ` Alexander Monakov
2022-11-21 11:40 ` Joshi, Tejas Sanjay
2022-11-21 15:30 ` Alexander Monakov [this message]
2022-12-01 11:28 ` Joshi, Tejas Sanjay
2022-12-01 19:01 ` Alexander Monakov
2022-12-12 21:41 ` Jan Hubička
2022-12-22 17:34 ` Joshi, Tejas Sanjay
2023-01-03 14:36 ` Jan Hubicka
2023-01-03 14:52 ` Alexander Monakov
2023-01-03 18:28 ` Jan Hubicka
2023-01-05 5:52 ` Joshi, Tejas Sanjay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=27b06e9f-1e80-585d-624e-6f50475a5aa8@ispras.ru \
--to=amonakov@ispras.ru \
--cc=TejasSanjay.Joshi@amd.com \
--cc=Venkataramanan.Kumar@amd.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=honza.hubicka@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).