public inbox for gcc-patches@gcc.gnu.org
From: Palmer Dabbelt <palmer@dabbelt.com>
To: Jeff Law <jlaw@ventanamicro.com>
Cc: xry111@xry111.site, jiawei@iscas.ac.cn, gcc-patches@gcc.gnu.org,
	kito.cheng@sifive.com, christoph.muellner@vrull.eu,
	wuwei2016@iscas.ac.cn, shihua@iscas.ac.cn, shiyulong@iscas.ac.cn,
	chenyixuan@iscas.ac.cn
Subject: Re: TARGET_RTX_COSTS and pipeline latency vs. variable-latency instructions (was Re: [PATCH] RISC-V: Add XiangShan Nanhu microarchitecture.)
Date: Mon, 25 Mar 2024 13:13:55 -0700 (PDT)
Message-ID: <mhng-0c2148c5-3ef5-4480-8989-746d7deee700@palmer-ri-x1c9>
In-Reply-To: <af9b99ce-26d8-4dc1-b471-21ee716d463e@ventanamicro.com>

On Mon, 25 Mar 2024 12:59:14 PDT (-0700), Jeff Law wrote:
>
>
> On 3/25/24 1:48 PM, Xi Ruoyao wrote:
>> On Mon, 2024-03-18 at 20:54 -0600, Jeff Law wrote:
>>>> +/* Costs to use when optimizing for xiangshan nanhu.  */
>>>> +static const struct riscv_tune_param xiangshan_nanhu_tune_info = {
>>>> +  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},	/* fp_add */
>>>> +  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},	/* fp_mul */
>>>> +  {COSTS_N_INSNS (10), COSTS_N_INSNS (20)},	/* fp_div */
>>>> +  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},	/* int_mul */
>>>> +  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},	/* int_div */
>>>> +  6,						/* issue_rate */
>>>> +  3,						/* branch_cost */
>>>> +  3,						/* memory_cost */
>>>> +  3,						/* fmv_cost */
>>>> +  true,						/* slow_unaligned_access */
>>>> +  false,					/* use_divmod_expansion */
>>>> +  RISCV_FUSE_ZEXTW | RISCV_FUSE_ZEXTH,          /* fusible_ops */
>>>> +  NULL,						/* vector cost */
>>
>>> Is your integer division really that fast?  The table above essentially
>>> says that your cpu can do integer division in 6 cycles.
>>
>> Hmm, I just seen I've coded some even smaller value for LoongArch CPUs
>> so forgive me for "hijacking" this thread...
>>
>> The problem seems integer division may spend different number of cycles
>> for different inputs: on LoongArch LA664 I've observed 5 cycles for some
>> inputs and 39 cycles for other inputs.
>>
>> So should we use the minimal value, the maximum value, or something in-
>> between for TARGET_RTX_COSTS and pipeline descriptions?
> Yea, early outs are relatively common in the actual hardware
> implementation.
>
> The biggest reason to refine the cost of a division is so that we've got
> a reasonably accurate cost for division by a constant -- which can often
> be done with multiplication by reciprocal sequence.  The multiplication
> by reciprocal sequence will use mult, add, sub, shadd insns and you need
> a reasonable cost model for those so you can compare against the cost of
> a hardware division.
>
> So to answer your question.  Choose something sensible, you probably
> don't want the fastest case and you may not want the slowest case.
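
To make the trade-off above concrete, here is a minimal sketch in C of the
comparison being described. It is an illustration, not GCC's actual code:
COSTS_N_INSNS is redefined locally (assumed to scale by 4, as in GCC), and
div10_by_reciprocal() and prefer_reciprocal() are hypothetical names. The
sequence cost is modeled as one multiply plus some number of single-cycle
insns (shifts, adds, subs).

```c
#include <stdint.h>

/* Local stand-in for GCC's macro; the scale factor of 4 is an assumption
   made so this sketch is self-contained.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* x / 10 for 32-bit unsigned x, done as a multiply by a reciprocal:
   0xCCCCCCCD == ceil(2^35 / 10), so the high bits of the 64-bit product
   shifted right by 35 give the exact quotient.  */
static uint32_t div10_by_reciprocal(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Hypothetical helper: nonzero when a reciprocal sequence made of one
   multiply plus n_simple_insns single-cycle insns is cheaper than the
   cost assigned to a hardware division.  */
static int prefer_reciprocal(int div_cost, int mul_cost, int n_simple_insns)
{
    int seq_cost = mul_cost + COSTS_N_INSNS(n_simple_insns);
    return seq_cost < div_cost;
}
```

With int_mul at COSTS_N_INSNS (3) and int_div at COSTS_N_INSNS (6) as in the
table above, prefer_reciprocal() accepts a multiply plus one shift but
rejects a multiply plus three simple insns. With the division cost raised to
COSTS_N_INSNS (39) both sequences win, which is why the chosen division cost
matters.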

Maybe we should have some sort of per-bit-set cost hook for mul/div?
Without that we're kind of just guessing at whether the implementation
has early outs, based on the heuristics used to implicitly generate the
cost models.

Not sure that's really worth the complexity, though...
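
As a rough sketch of what such a hook might compute (purely hypothetical:
no such hook exists in GCC, and the linear model is an assumption rather
than a description of any real divider), the latency could be interpolated
between a best and worst case according to how many significant bits a
known operand has:

```c
#include <stdint.h>

/* Number of significant bits in x (0 when x == 0).  */
static int significant_bits(uint64_t x)
{
    int n = 0;
    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Hypothetical early-out model: latency grows linearly from min_cycles
   (operand is zero) to max_cycles (all 64 bits significant).  Real
   hardware need not behave this way.  */
static int div_cycles_for_operand(uint64_t operand,
                                  int min_cycles, int max_cycles)
{
    int bits = significant_bits(operand);
    return min_cycles + (max_cycles - min_cycles) * bits / 64;
}
```

Plugging in the 5 and 39 cycle figures quoted above for LA664, a 32-bit
operand would land near the middle of the range under this model. Whether
that matches real early-out behavior is exactly what we'd be guessing at.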

> Jeff

