From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <vineetg@rivosinc.com>
Received: from mail-pg1-x52b.google.com (mail-pg1-x52b.google.com
 [IPv6:2607:f8b0:4864:20::52b])
 by sourceware.org (Postfix) with ESMTPS id BC3533857B86
 for <gcc-patches@gcc.gnu.org>; Tue, 24 May 2022 19:38:04 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BC3533857B86
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-pg1-x52b.google.com with SMTP id a9so15110448pgv.12
 for <gcc-patches@gcc.gnu.org>; Tue, 24 May 2022 12:38:04 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=message-id:date:mime-version:user-agent:subject:content-language:to
 :cc:references:from:in-reply-to:content-transfer-encoding;
 bh=oS2Z+kRzhviA2E+vriWiJkelr36XMYch1mnbhCPLSQg=;
 b=Zx6gYDga4UjjByJH1biIwioxiBBfXR3wPVrxyMYoDR1CRxPfOzJUqdXC+mE1Rvmepz
 GHbqNIPCNrGuyzXabf6Qgim1jIsPwYy6p3A0hsIFUAdda8ofZxduxnE7NHCKsjEz6M2k
 06ZXUjbSRlOMRjA4kiqhN52zelb0iDw4q1sBVClUWXawjRcSAF3hb9csWMrCM+RYiFqw
 OORmKUO1TKVBbn/7SfQvtLWElNwigYYJsSXbMTRC3rti/y219JS154OTkpKMSlDsct2b
 UflWC7myipi/9OfwhbvaCM1jZ1fJA6X3iAx5G5KRomjeSTUaDykvOWQlEwvgPZQSYaK+
 pDsg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:message-id:date:mime-version:user-agent:subject
 :content-language:to:cc:references:from:in-reply-to
 :content-transfer-encoding;
 bh=oS2Z+kRzhviA2E+vriWiJkelr36XMYch1mnbhCPLSQg=;
 b=54f3XB3yKxtiZ0s9aUKrwTLjLFUyuHwtob8jYkBbLMh7RuSO/yUi5Ric32pOUQ14mT
 ZeR6uUeWzvv1hMMopvalOvrgtCjAcH7jkRtpbngBGg/Tg4u4ELXpf5k1jg5BebVwSPIM
 lUouj9x6xwJAkL7Tx4n7ZrbV3ljzQqi4LtVl6WErP8HrkPlfYiFHe81f8kuuXwxcOkk4
 9+3U4K+kFIhdjz+wnv1i4upy5Ng65Gbc/tmJ/qQY7CXkcZ4VGpxBVcyUXrrW08buT3Zj
 xgkhVF8GlCw+QnaSkE8v8iXfFAHbHBkIIz4B3zFf1knczYd41bIzw3XW4vZK9Y2udiOt
 A6nQ==
X-Gm-Message-State: AOAM530HZiFTQC8BUbZDIxiH9zaa65QtG2bPLjQO4pyBqORLwtH3VPXM
 fHDQaex+eDRgBjzk7RF9LE5lXA==
X-Google-Smtp-Source: ABdhPJzeumN9kYOXRMVZiuk/gvDvHq4qSCCJuF12U+mUpwkFaeLpB3TUpzSNlEhub+//WJJrOZRgCg==
X-Received: by 2002:a62:1413:0:b0:518:4259:200e with SMTP id
 19-20020a621413000000b005184259200emr27751730pfu.41.1653421083647; 
 Tue, 24 May 2022 12:38:03 -0700 (PDT)
Received: from [192.168.50.116] (c-24-4-73-83.hsd1.ca.comcast.net.
 [24.4.73.83]) by smtp.gmail.com with ESMTPSA id
 s7-20020a056a001c4700b0050dc76281dcsm9662518pfw.182.2022.05.24.12.38.02
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Tue, 24 May 2022 12:38:03 -0700 (PDT)
Message-ID: <ce7b0ec0-8663-b2a0-9268-4b7225083d8f@rivosinc.com>
Date: Tue, 24 May 2022 12:38:02 -0700
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.8.1
Subject: Re: [PATCH] [PR/target 105666] RISC-V: Inhibit FP <--> int register
 moves via tune param
Content-Language: en-US
To: Kito Cheng <kito.cheng@gmail.com>,
 Philipp Tomsich <philipp.tomsich@vrull.eu>
Cc: Andrew Waterman <andrew@sifive.com>, GCC Patches
 <gcc-patches@gcc.gnu.org>, gnu-toolchain@rivosinc.com
References: <20220523181209.2208136-1-vineetg@rivosinc.com>
 <CAAeLtUAGS8Ehj9oC+w-KTVUAWbN1OVMJ3pKoLSzKfT5vc_xaUw@mail.gmail.com>
 <CA+yXCZC6s_R34qW=Skdd__dNR0mQoBL7s0Sef7JYYJFkTWCz1A@mail.gmail.com>
From: Vineet Gupta <vineetg@rivosinc.com>
In-Reply-To: <CA+yXCZC6s_R34qW=Skdd__dNR0mQoBL7s0Sef7JYYJFkTWCz1A@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_BARRACUDACENTRAL,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 24 May 2022 19:38:07 -0000


On 5/24/22 00:59, Kito Cheng wrote:
> Committed, thanks!

Thanks for the quick action, Kito.
Can this be backported to GCC 12 as well?

Thx,
-Vineet

>
> On Tue, May 24, 2022 at 3:40 AM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
>> Good catch!
>>
>> On Mon, 23 May 2022 at 20:12, Vineet Gupta <vineetg@rivosinc.com> wrote:
>>
>>> Under extreme register pressure, the compiler can use FP <--> int
>>> moves as a cheap alternative to spilling to memory.
>>> This was seen with SPEC2017 FP benchmark 507.cactu:
>>> ML_BSSN_Advect.cc:ML_BSSN_Advect_Body()
>>>
>>> |       fmv.d.x fa5,s9  # PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
>>> | .LVL325:
>>> |       ld      s9,184(sp)              # _12469, %sfp
>>> | ...
>>> | .LVL339:
>>> |       fmv.x.d s4,fa5  # PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
>>> |
>>>
>>> The FMV instructions could be costlier than a stack spill on certain
>>> micro-architectures, so this needs to be a per-CPU tunable
>>> (the default being to inhibit them on all existing RV CPUs).
>>>
>>> Without the fix, a testsuite run with the new test reports 10 failures,
>>> corresponding to the build variations of pr105666.c.
>>>
>>> |               === gcc Summary ===
>>> |
>>> | # of expected passes          123318   (+10)
>>> | # of unexpected failures      34       (-10)
>>> | # of unexpected successes     4
>>> | # of expected failures        780
>>> | # of unresolved testcases     4
>>> | # of unsupported tests        2796
>>>
>>> gcc/ChangeLog:
>>>
>>>          * config/riscv/riscv.cc (struct riscv_tune_param): Add
>>>            fmv_cost.
>>>          (rocket_tune_info): Add default fmv_cost 8.
>>>          (sifive_7_tune_info): Ditto.
>>>          (thead_c906_tune_info): Ditto.
>>>          (optimize_size_tune_info): Ditto.
>>>          (riscv_register_move_cost): Use fmv_cost for int<->fp moves.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>          * gcc.target/riscv/pr105666.c: New test.
>>>
>>> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
>>> ---
>>>   gcc/config/riscv/riscv.cc                 |  9 ++++
>>>   gcc/testsuite/gcc.target/riscv/pr105666.c | 55 +++++++++++++++++++++++
>>>   2 files changed, 64 insertions(+)
>>>   create mode 100644 gcc/testsuite/gcc.target/riscv/pr105666.c
>>>
>>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>>> index ee756aab6940..f3ac0d8865f0 100644
>>> --- a/gcc/config/riscv/riscv.cc
>>> +++ b/gcc/config/riscv/riscv.cc
>>> @@ -220,6 +220,7 @@ struct riscv_tune_param
>>>     unsigned short issue_rate;
>>>     unsigned short branch_cost;
>>>     unsigned short memory_cost;
>>> +  unsigned short fmv_cost;
>>>     bool slow_unaligned_access;
>>>   };
>>>
>>> @@ -285,6 +286,7 @@ static const struct riscv_tune_param rocket_tune_info = {
>>>     1,                                           /* issue_rate */
>>>     3,                                           /* branch_cost */
>>>     5,                                           /* memory_cost */
>>> +  8,                                           /* fmv_cost */
>>>     true,                                        /* slow_unaligned_access */
>>>   };
>>>
>>> @@ -298,6 +300,7 @@ static const struct riscv_tune_param sifive_7_tune_info = {
>>>     2,                                           /* issue_rate */
>>>     4,                                           /* branch_cost */
>>>     3,                                           /* memory_cost */
>>> +  8,                                           /* fmv_cost */
>>>     true,                                        /* slow_unaligned_access */
>>>   };
>>>
>>> @@ -311,6 +314,7 @@ static const struct riscv_tune_param thead_c906_tune_info = {
>>>     1,            /* issue_rate */
>>>     3,            /* branch_cost */
>>>     5,            /* memory_cost */
>>> +  8,           /* fmv_cost */
>>>     false,            /* slow_unaligned_access */
>>>   };
>>>
>>> @@ -324,6 +328,7 @@ static const struct riscv_tune_param optimize_size_tune_info = {
>>>     1,                                           /* issue_rate */
>>>     1,                                           /* branch_cost */
>>>     2,                                           /* memory_cost */
>>> +  8,                                           /* fmv_cost */
>>>     false,                                       /* slow_unaligned_access */
>>>   };
>>>
>>> @@ -4737,6 +4742,10 @@ static int
>>>   riscv_register_move_cost (machine_mode mode,
>>>                            reg_class_t from, reg_class_t to)
>>>   {
>>> +  if ((from == FP_REGS && to == GR_REGS) ||
>>> +      (from == GR_REGS && to == FP_REGS))
>>> +    return tune_param->fmv_cost;
>>> +
>>>     return riscv_secondary_memory_needed (mode, from, to) ? 8 : 2;
>>>   }
>>>
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr105666.c b/gcc/testsuite/gcc.target/riscv/pr105666.c
>>> new file mode 100644
>>> index 000000000000..904f3bc0763f
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr105666.c
>>> @@ -0,0 +1,55 @@
>>> +/* Shamelessly plugged off gcc/testsuite/gcc.c-torture/execute/pr28982a.c.
>>> +
>>> +   The idea is to induce high register pressure for both int/fp registers
>>> +   so that they spill. By default FMV instructions would be used to stash
>>> +   int reg to a fp reg (and vice-versa) but that could be costlier than
>>> +   spilling to stack.  */
>>> +
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64g -ffast-math" } */
>>> +
>>> +#define NITER 4
>>> +#define NVARS 20
>>> +#define MULTI(X) \
>>> +  X( 0), X( 1), X( 2), X( 3), X( 4), X( 5), X( 6), X( 7), X( 8), X( 9), \
>>> +  X(10), X(11), X(12), X(13), X(14), X(15), X(16), X(17), X(18), X(19)
>>> +
>>> +#define DECLAREI(INDEX) inc##INDEX = incs[INDEX]
>>> +#define DECLAREF(INDEX) *ptr##INDEX = ptrs[INDEX], result##INDEX = 5
>>> +#define LOOP(INDEX) result##INDEX += result##INDEX * (*ptr##INDEX), ptr##INDEX += inc##INDEX
>>> +#define COPYOUT(INDEX) results[INDEX] = result##INDEX
>>> +
>>> +double *ptrs[NVARS];
>>> +double results[NVARS];
>>> +int incs[NVARS];
>>> +
>>> +void __attribute__((noinline))
>>> +foo (int n)
>>> +{
>>> +  int MULTI (DECLAREI);
>>> +  double MULTI (DECLAREF);
>>> +  while (n--)
>>> +    MULTI (LOOP);
>>> +  MULTI (COPYOUT);
>>> +}
>>> +
>>> +double input[NITER * NVARS];
>>> +
>>> +int
>>> +main (void)
>>> +{
>>> +  int i;
>>> +
>>> +  for (i = 0; i < NVARS; i++)
>>> +    ptrs[i] = input + i, incs[i] = i;
>>> +  for (i = 0; i < NITER * NVARS; i++)
>>> +    input[i] = i;
>>> +  foo (NITER);
>>> +  for (i = 0; i < NVARS; i++)
>>> +    if (results[i] != i * NITER * (NITER + 1) / 2)
>>> +      return 1;
>>> +  return 0;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */
>>> +/* { dg-final { scan-assembler-not "\tfmv\\.x\\.d\t" } } */
>>> --
>>> 2.32.0
>>>
>>>
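
For anyone skimming the thread, the effect of the riscv_register_move_cost
change can be sketched as a standalone C model. This is only an
illustration under stated assumptions, not the GCC source: the enum, the
`tune_model` struct, and the `needs_secondary_memory` flag are hypothetical
stand-ins for GCC's reg_class_t, riscv_tune_param, and the result of
riscv_secondary_memory_needed (). The constants 8 and 2 and the fmv_cost
value of 8 are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register classes (not the real enum).  */
enum reg_class_model { GR_REGS_M, FP_REGS_M, OTHER_REGS_M };

/* Hypothetical subset of riscv_tune_param: only the fields relevant here.  */
struct tune_model
{
  unsigned short memory_cost;
  unsigned short fmv_cost;
};

/* Model of the patched cost hook: a direct int <-> FP move is charged the
   per-CPU fmv_cost; every other move keeps the pre-existing 8 : 2 split.
   NEEDS_SECONDARY_MEMORY stands in for riscv_secondary_memory_needed ().  */
int
register_move_cost_model (const struct tune_model *tune,
                          enum reg_class_model from,
                          enum reg_class_model to,
                          bool needs_secondary_memory)
{
  if ((from == FP_REGS_M && to == GR_REGS_M)
      || (from == GR_REGS_M && to == FP_REGS_M))
    return tune->fmv_cost;

  return needs_secondary_memory ? 8 : 2;
}
```

With a rocket-like tuning of { memory_cost = 5, fmv_cost = 8 }, the model
returns 8 for a move between GR_REGS_M and FP_REGS_M in either direction and
2 for an ordinary move within one class. A CPU on which FMV is cheap could
set a lower fmv_cost in its own tune struct.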