From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-io1-xd2c.google.com (mail-io1-xd2c.google.com [IPv6:2607:f8b0:4864:20::d2c]) by sourceware.org (Postfix) with ESMTPS id AC20A3858D1E for ; Fri, 10 Nov 2023 21:21:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AC20A3858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AC20A3858D1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::d2c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699651291; cv=none; b=TwjKPTtN+QLL38Yr0EOQ9Xq0f4cK87CH+AzQttwQnHKLEyzWTIA/UtNPlxgn4ZxGcLs850dAA3RS2XW+OriALDIgbKOSfxsYQdXrytNx90DY3kTeRFo9kdcspb29VjCM9B9xT4IpsbXbM0FMyIz/OEEWQGN9v4s41cFVVGd95KI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699651291; c=relaxed/simple; bh=NttpqtdlbXPF7mjAbLG/u5k2k/lWGMSIC1rjy7hdI/4=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=WIC0m6Myl0FxgiYlU5T1t4a+UAHKZTNHtaiW3v5AP73keSxzmt90x5MbM0k0wmiIGfVCz0On52w53YMvJl67ElwY3q4naAtObd4nLamkLDi8A2IWYZXSM8U3zfEYtRskaxHJQewu9PiPseNgkqasa482YPiCQk/q+fXYNzdJcb8= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-io1-xd2c.google.com with SMTP id ca18e2360f4ac-7aca968a063so98495639f.0 for ; Fri, 10 Nov 2023 13:21:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1699651289; x=1700256089; darn=gcc.gnu.org; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=qID2B+8ngeI1Hwz9yE0/J6Zm2PYg4BPpP9dGjyXDRSA=; b=lEqis4sxV+5AW5NL+wyKhIdLbXMmL0gQY0bsEy2JFo00GtHUDqbGDzqTgr/GjKBkxw Mj6uszwfV5XHFBSdjtZN1dc/jKt9i1a7r/d5oWBLLWYe25dgqmBNsYNcJYQzEW8KEkXL MSUIONJx0kLEkaZFa5i6xEC8dmHxGKl/V5UDUC7tFcCRE0rKrxjCxLtjH4KgxK9VkmY6 6c1k9IhzpHKkjJU58PdNgVEXdYLItN0j4f3T0qVt7DViUbn5mIXgeeaFaJcOmFc8mTiv zFo8YQNg3RiTh8CM1fvwummxZFxJzagWHCDE2rXofBYZcULUMuwbk+d3lkFzkfwPd2gQ zpXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699651289; x=1700256089; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qID2B+8ngeI1Hwz9yE0/J6Zm2PYg4BPpP9dGjyXDRSA=; b=VCw8FKFeKgGr1V+hzec58wZsMk1KL6/ZDpCJN89q/Bq1SzillJ7vOx9ppeTWz1gu6x sPJ5KyUFjcASz7j5She0vxPpottzMlUlKSLkYj4zzKd3fN5BpTDkY2NspuP5oxNjQfZC RSQEWEP9BmECU/4mBtwREsEN+2N8+QB5Gyss88nj+y7/OrgQl1V77ZJUYGm0F/qXre03 Cp6NnQO6Gj2aWt5Tiv5bvyouZELkYesDRZjUO+2Ooqk9INr1AmlX68ZyVAlQTG70is3g 97ZfGdltYHnFbPRv32TJFD3TJ8enH1Be27MSPGCmsQKlINBfSBQ8UAE73qqRufS+pEO5 4RQg== X-Gm-Message-State: AOJu0YxgC5jC/zf5q7AtDEQ1kVYmP0DxXCwpMOAQooGwaE2pIiUMBESJ AYoP1ZeCk1Gdvz0tVdjc5Wk= X-Google-Smtp-Source: AGHT+IE7aEWDpf2JHjF0NtaxaKkJB1/RGAogJUPl+lOpK/Rvwnd/s02jIY8sf3xGAWNwOkrGUX7ZPQ== X-Received: by 2002:a05:6602:498f:b0:79f:d4e6:5175 with SMTP id eg15-20020a056602498f00b0079fd4e65175mr602832iob.16.1699651288810; Fri, 10 Nov 2023 13:21:28 -0800 (PST) Received: from [172.31.0.109] ([136.36.130.248]) by smtp.gmail.com with ESMTPSA id m21-20020a02c895000000b00454df11c71csm64015jao.97.2023.11.10.13.21.27 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 10 Nov 2023 13:21:28 -0800 (PST) Message-ID: <9fe0b923-eca5-4a4d-91e4-8318b8e2b1cc@gmail.com> Date: Fri, 10 Nov 2023 14:21:26 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC 1/2] RISC-V: Add support for _Bfloat16. Content-Language: en-US To: Jin Ma , gcc-patches Cc: palmer , "richard.sandiford" , "kito.cheng" , "philipp.tomsich" , "christoph.muellner" , "rdapp.gcc" , "juzhe.zhong" , "jinma.contrib" References: <20230919084444.2089-1-jinma@linux.alibaba.com> <0635e268-2181-437d-a1e9-a8f9c27d90bd@gmail.com> <06e35f34-2301-4a60-8dae-797925e88c0c@gmail.com> From: Jeff Law In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 10/25/23 04:15, Jin Ma wrote: >>>>> +;; The conversion of DF to BF needs to be done with SF if there is a >>>>> +;; chance to generate at least one instruction, otherwise just using >>>>> +;; libfunc __truncdfbf2. >>>>> +(define_expand "truncdfbf2" >>>>> + [(set (match_operand:BF 0 "register_operand" "=f") >>>>> + (float_truncate:BF >>>>> + (match_operand:DF 1 "register_operand" " f")))] >>>>> + "TARGET_DOUBLE_FLOAT || TARGET_ZDINX" >>>>> + { >>>>> + convert_move (operands[0], >>>>> + convert_modes (SFmode, DFmode, operands[1], 0), 0); >>>>> + DONE; >>>>> + }) >>>> So for conversions to/from BFmode, doesn't generic code take care of >>>> this for us? Search for convert_mode_scalar in expr.cc. That code will >>>> utilize SFmode as an intermediate step just like your expander. Is >>>> there some reason that generic code is insufficient? >>>> >>>> Similarly for the the other conversions. >>> >>> As far as I can see, the function 'convert_mode_scalar' doesn't seem to be perfect for >>> dealing with the conversions to/from BFmode. It can only handle BF to HF, SF, DF and >>> SF to BF well, but the rest of the conversion without any processing, directly using >>> the libcall. >>> >>> Maybe I should choose to enhance its functionality? This seems to be a >>> good choice, I'm not sure.My recollection was that BF could be converted to/from SF trivially and >> if we wanted BF->DF we'd first convert to SF, then to DF. >> >> Direct BF<->DF conversions aren't actually important from a performance >> standpoint. So it's OK if they have an extra step IMHO. > > Thank you very much for your review and detailed reply. Maybe there are some problems with my expression > and I am a little confused about your guidance. My understanding is that you also think that it is reasonable to > convert through SF, right? In fact, this is what I did. My point was that I would expect the generic code to handle the conversion and that we didn't need to handle it explicitly in the RISC-V backend. Meaning that I don't think we need a define_expand for truncdfbf2, fix_truncbf2, fixuns_truncbf2, floatbf2, or floatunsbf2. > > In this patch, my thoughts are as follows: > > The general principle is to use the real instructions instead of libcall as much as possible for conversions, > while minimizing the definition of libcall(only reusing which has been defined by other architectures such > as aarch64). If SF can be used as a transit, it is preferred to convert to SF, otherwise libcall is directly used. > > 1. For the conversions between floating points > > For BF->DF, as you said, the function 'convert_mode_scalar' in the general code has been well implemented, > which will be expressed as BF->SF->DF. And the generated instruction list may be as follows: > 'call __extendbfsf2' + 'call __extendsfdf2' (when only soft floating point support); > 'call __extendbfsf2' + 'fcvt.d.s' (when (TARGET_DOUBLE_FLOAT || TARGET_ZDINX) is true); > 'fcvt.s.bf16' + 'fcvt.d.s' (when ((TARGET_DOUBLE_FLOAT || TARGET_ZDINX) && TARGET_ZFBFMIN) is true) > > For DF->BF, if any of fcvt.s.d and fcvt.bf16.s cannot be generated, the 'call __truncdfbf2' is directly generated > by the function 'convert_mode_scalar'. Otherwise the new pattern(define_expand "truncdfbf2") is used. This > makes it possible to implement DF->BF by 'fcvt.s.d' + 'fcvt.bf16.s', which cannot be generated by the function > 'convert_mode_scala'. But I would have expected convert_mode_scalar to generate DF->BF by first truncating to SF, then to BF. If that is missing for truncation, then we should add it to convert_mode_scalar rather than expressing it as a backend expander. > > 2. For the conversions between integer and BF, it seems that gcc only uses libcall to implement it, but this is > obviously wrong. For example, the conversion BF->SI directly calls the unimplemented libcall __fixunsbfsi. > So I added some new pattern to handle these transformations with SF. I would suggest these move into target independent code as well. There's no reason I'm aware of that they should be implemented entirely in a target machine description. We're not really doing anything target specific in here. jeff