From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 26 Jul 2023 09:00:13 -0700 (PDT)
X-Google-Original-Date: Wed, 26 Jul 2023 09:00:11 PDT (-0700)
Subject:     Re: Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding
In-Reply-To: <CALLt3TjFNsjOeDRvpCMdf1vbxnn6fX7HCL0Jv5t+kSDG0OzuPA@mail.gmail.com>
CC: pan2.li@intel.com, yanzhang.wang@intel.com, gcc-patches@gcc.gnu.org,
  rdapp.gcc@gmail.com, juzhe.zhong@rivai.ai
From: Palmer Dabbelt <palmer@dabbelt.com>
To: gcc-patches@gcc.gnu.org
Message-ID: <mhng-fcc1dabc-042a-4724-b320-0433678a708c@palmer-ri-x1c9>
Mime-Version: 1.0 (MHng)
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
List-Id: <gcc-patches.gcc.gnu.org>

On Wed, 26 Jul 2023 08:34:14 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> I would say LCM/PRE is the key to this set of static rounding mode
> intrinsics; otherwise I think it will push people to use dynamic rounding
> with fesetround or inline asm to set the rounding mode for performance
> reasons. That is the opposite of the design concept: we want to provide
> a reliable and performant way to precisely control the rounding mode.
>
> The function call issue could in theory be resolved by the fenv_access
> pragma, since it can act as an annotation telling the compiler whether a
> function modifies fenv or not, but unfortunately it’s not well modeled
> within GCC yet, so we must be conservative to make sure we don't break anything.
>
> And also the LLVM side is trying to implement some simple LCM/PRE to
> optimize that, so I believe we need LCM/PRE based mode switching to do that.

IMO that's a perfectly reasonable way to start: let's just get something 
that's correct and simple; if we need to do more complicated stuff later 
we can always add it.

There's going to be a very small amount of this code written by a very 
small number of people (that are likely very close to the compiler teams 
doing the optimizations here), so we can just all work with each other 
to sort out any important performance issues as we go.

I think whether LCM or entry/exit performs better is probably just going 
to boil down to some uarch/workload specific decisions, so as long as 
whatever we have is correct and reasonably simple it seems fine for now.  
Given how little of this code there's going to be it's probably not 
worth spending a ton of time on things until we have a concrete use case 
to drive things.
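
As a rough illustration of the trade-off, here is a toy sketch in plain C. 
It is not the RVV intrinsics and not GCC's mode-switching pass: the `frm` 
variable only stands in for the FRM CSR, and `write_frm`, `run_naive` and 
`run_hoisted` are names made up for this example. It just counts how many 
FRM writes each strategy would issue.

```c
#include <stddef.h>

/* Toy model (assumption): a plain variable stands in for the FRM CSR and
   every write to it is counted.  Nothing here touches hardware state. */
enum { FRM_RNE = 0, FRM_RTZ = 1 };   /* RISC-V frm encodings for RNE and RTZ */

static int frm = FRM_RNE;
static unsigned long frm_writes;

static void write_frm(int mode)
{
    frm = mode;
    frm_writes++;
}

/* Naive: back up, set and restore FRM around every single operation. */
static void run_naive(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int saved = frm;
        write_frm(FRM_RTZ);
        out[i] = a[i] + b[i];        /* placeholder for the real vector op */
        write_frm(saved);
    }
}

/* Hoisted: roughly the placement an LCM/PRE-style pass aims for in this
   simple loop, with one switch on entry and one restore on exit. */
static void run_hoisted(const float *a, const float *b, float *out, size_t n)
{
    int saved = frm;
    write_frm(FRM_RTZ);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];        /* placeholder for the real vector op */
    write_frm(saved);
}
```

For n operations the naive version issues 2n FRM writes and the hoisted 
version issues 2. Whether that difference matters in practice is the 
uarch/workload question above.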

Let's just make sure to also update the intrinsic spec to get rid of the 
grey area here, that way we can point to something if we want to 
optimize differently in the future.

> On Wed, Jul 26, 2023 at 22:31, Li, Pan2 <pan2.li@intel.com> wrote:
>
>> As Juzhe mentioned, the CALL problem is resolved by LCM/PRE,
>> similar to the VSETVL pass, which is well proven up to a point.
>>
>> I would like to propose that we stay focused and move forward with this
>> patch itself; the rest of the underlying RVV floating-point API support
>> and the full RVV intrinsic API tests depend on it.
>>
>>
>>
>> Of course, I am working on PATCH v8 and thanks again for Robin’s comments.
>>
>>
>>
>> Pan
>>
>>
>>
>> *From:* 钟居哲 <juzhe.zhong@rivai.ai>
>> *Sent:* Wednesday, July 26, 2023 10:18 PM
>> *To:* rdapp.gcc <rdapp.gcc@gmail.com>; Li, Pan2 <pan2.li@intel.com>
>> *Cc:* rdapp.gcc <rdapp.gcc@gmail.com>; kito.cheng <kito.cheng@sifive.com>;
>> gcc-patches <gcc-patches@gcc.gnu.org>; Wang, Yanzhang <
>> yanzhang.wang@intel.com>
>> *Subject:* Re: Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point
>> dynamic rounding
>>
>>
>>
>> Explicitly back up and restore for each intrinsic, just as we did
>> for CALL in this patch.
>>
>> I don't have data to prove how well LCM/PRE mode switching works,
>> but I trust it, since LCM/PRE is the key optimization method of the
>> VSETVL pass, which is doing a good job on VSETVL instruction
>> optimizations.
>>
>> I don't think we should give up the chance to use LCM/PRE and just
>> blindly back up and restore for each intrinsic.
>>
>>
>>
>>
>> ------------------------------
>>
>> juzhe.zhong@rivai.ai
>>
>>
>>
>> *From:* Robin Dapp <rdapp.gcc@gmail.com>
>> *Date:* 2023-07-26 21:46
>> *To:* juzhe.zhong <juzhe.zhong@rivai.ai>; Li, Pan2 <pan2.li@intel.com>
>> *CC:* rdapp.gcc <rdapp.gcc@gmail.com>; Kito Cheng <kito.cheng@sifive.com>;
>> gcc-patches@gcc.gnu.org; Wang, Yanzhang <yanzhang.wang@intel.com>
>> *Subject:* Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point
>> dynamic rounding
>>
>> > current llvm didn't do any pre optimization.  They always
>> > backup+restore for each rounding mode intrinsic
>>
>> I see.  There is still the option of lazily restoring the
>> (entry) FRM before a function call but not read the FRM
>> after every call.  Do we have any data on how good or bad the
>> mode-switching LCM works when we explicitly backup and restore
>> for each intrinsic?
>>
>> Regards
>> Robin