public inbox for gcc-patches@gcc.gnu.org
From: Jeff Law <jlaw@ventanamicro.com>
To: Palmer Dabbelt <palmer@dabbelt.com>
Cc: xry111@xry111.site, jiawei@iscas.ac.cn, gcc-patches@gcc.gnu.org,
	kito.cheng@sifive.com, christoph.muellner@vrull.eu,
	wuwei2016@iscas.ac.cn, shihua@iscas.ac.cn, shiyulong@iscas.ac.cn,
	chenyixuan@iscas.ac.cn
Subject: Re: TARGET_RTX_COSTS and pipeline latency vs. variable-latency instructions (was Re: [PATCH] RISC-V: Add XiangShan Nanhu microarchitecture.)
Date: Mon, 25 Mar 2024 14:27:34 -0600
Message-ID: <3315b6a9-96df-415d-b82f-806dada10154@ventanamicro.com>
In-Reply-To: <mhng-0c2148c5-3ef5-4480-8989-746d7deee700@palmer-ri-x1c9>



On 3/25/24 2:13 PM, Palmer Dabbelt wrote:
> On Mon, 25 Mar 2024 12:59:14 PDT (-0700), Jeff Law wrote:
>>
>>
>> On 3/25/24 1:48 PM, Xi Ruoyao wrote:
>>> On Mon, 2024-03-18 at 20:54 -0600, Jeff Law wrote:
>>>>> +/* Costs to use when optimizing for xiangshan nanhu.  */
>>>>> +static const struct riscv_tune_param xiangshan_nanhu_tune_info = {
>>>>> +  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},    /* fp_add */
>>>>> +  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},    /* fp_mul */
>>>>> +  {COSTS_N_INSNS (10), COSTS_N_INSNS (20)},    /* fp_div */
>>>>> +  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},    /* int_mul */
>>>>> +  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},    /* int_div */
>>>>> +  6,                        /* issue_rate */
>>>>> +  3,                        /* branch_cost */
>>>>> +  3,                        /* memory_cost */
>>>>> +  3,                        /* fmv_cost */
>>>>> +  true,                        /* slow_unaligned_access */
>>>>> +  false,                    /* use_divmod_expansion */
>>>>> +  RISCV_FUSE_ZEXTW | RISCV_FUSE_ZEXTH,          /* fusible_ops */
>>>>> +  NULL,                        /* vector cost */
>>>
>>>> Is your integer division really that fast?  The table above essentially
>>>> says that your cpu can do integer division in 6 cycles.
>>>
>>> Hmm, I just saw that I've coded some even smaller values for LoongArch
>>> CPUs, so forgive me for "hijacking" this thread...
>>>
>>> The problem seems to be that integer division may take a different number
>>> of cycles for different inputs: on LoongArch LA664 I've observed 5 cycles
>>> for some inputs and 39 cycles for other inputs.
>>>
>>> So should we use the minimal value, the maximum value, or something in-
>>> between for TARGET_RTX_COSTS and pipeline descriptions?
>> Yea, early outs are relatively common in the actual hardware
>> implementation.
>>
>> The biggest reason to refine the cost of a division is so that we've got
>> a reasonably accurate cost for division by a constant -- which can often
>> be done with multiplication by reciprocal sequence.  The multiplication
>> by reciprocal sequence will use mult, add, sub, shadd insns and you need
>> a reasonable cost model for those so you can compare against the cost of
>> a hardware division.
>>
>> So to answer your question: choose something sensible.  You probably
>> don't want the fastest case and you may not want the slowest case.
> 
> Maybe we should have some sort of per-bit-set cost hook for mul/div?
> Without that we're kind of just guessing at whether the implementation
> has early outs based on heuristics used to implicitly generate the cost
> models.
> 
> Not sure that's really worth the complexity, though...
I doubt it's worth the complexity.  Picking some reasonable value gets
you the vast majority of the benefit.  Something like COSTS_N_INSNS(6)
is enough to get CSE to trigger.  So what's left is picking a reasonable
cost, particularly for the division-by-constant case, where we need a
ceiling for synth_mult.

Jeff
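
The "ceiling" point can be illustrated with a small sketch.  This is not
GCC code: struct toy_costs, reciprocal_seq_cost and prefer_reciprocal are
hypothetical names made up for illustration, and the model is far simpler
than what the expanders actually do.  Only the COSTS_N_INSNS scaling
(N * 4) matches GCC's rtl.h.  The idea is that the division cost bounds
how expensive a multiply-by-reciprocal sequence may be before the
hardware divide wins.

```c
#include <assert.h>
#include <stdbool.h>

/* Same scaling GCC uses in rtl.h: one "fast" instruction costs 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical, simplified cost table; NOT GCC's riscv_tune_param.  */
struct toy_costs
{
  int simple_op;  /* add, sub, shift, shadd */
  int int_mul;    /* integer multiply */
  int int_div;    /* integer divide */
};

/* Cost of a synthesized sequence made of N_MUL multiplies and
   N_SIMPLE simple operations (assumed purely additive here).  */
int
reciprocal_seq_cost (const struct toy_costs *c, int n_mul, int n_simple)
{
  assert (n_mul >= 0 && n_simple >= 0);
  return n_mul * c->int_mul + n_simple * c->simple_op;
}

/* True when the synthesized sequence is strictly cheaper than the
   hardware divide, i.e. the divide cost acts as the ceiling.  */
bool
prefer_reciprocal (const struct toy_costs *c, int n_mul, int n_simple)
{
  return reciprocal_seq_cost (c, n_mul, n_simple) < c->int_div;
}
```

With simple_op = COSTS_N_INSNS (1), int_mul = COSTS_N_INSNS (3) and
int_div = COSTS_N_INSNS (6), a sequence of one multiply plus two simple
ops costs 20 and beats the divide at 24, while one multiply plus four
simple ops costs 28 and loses to it.  Raising int_div to something like
COSTS_N_INSNS (20) would make both sequences win, which is why an
overly optimistic division cost can suppress the reciprocal expansion.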
