Date: Thu, 13 Oct 2022 23:35:01 +0200
From: Jakub Jelinek
To: Uros Bizjak
Cc: Jason Merrill, "Joseph S. Myers", Richard Biener, Jeff Law, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] middle-end, c++, i386, libgcc, v2: std::bfloat16_t and __bf16 arithmetic support

On Thu, Oct 13, 2022 at 11:11:53PM +0200, Uros Bizjak wrote:
> > > +      do_compare_rtx_and_jump (op1, op2, GET_CODE (operands[0]), 0,
> > > +                               SFmode, NULL_RTX, NULL,
> > > +                               as_a <rtx_code_label *> (operands[3]),
> > > +                               /* Unfortunately this isn't propagated.  */
> > > +                               profile_probability::even ());
>
> You could use ix86_expand_branch instead of do_compare_rtx_and_jump
> here. This would expand in SFmode, so insn condition from cbranchsf4
> should be copied here:
>
> "TARGET_80387 || (SSE_FLOAT_MODE_P (SFmode) && TARGET_SSE_MATH)"
>
> Additionally, ix86_fp_comparison_operator predicate should be used for
> operator0. Basically, just copy predicates from cbranchsf4 as we are
> effectively expanding the SFmode compare & branch.

The reason I used the generic routine there was precisely to handle not
just ix86_fp_comparison_operator, but also comparisons that are more
complex than that (those that need 2 comparisons).

For the ix86_fp_comparison_operator cases the optabs wouldn't actually be
strictly needed: the generic code would see that e.g. cbranchbf4 isn't
supported, try cbranchsf4 and succeed on that. The only disadvantage would
be that the BFmode -> SFmode extensions would be performed using library
functions unless -ffast-math, while they can be handled by left shifting
the 16 BFmode bits into the most significant 16 bits of SFmode even when
honoring NaNs.

For the non-ix86_fp_comparison_operator cases, the generic behavior is
actually that none of cbranchbf4, cbranchsf4, cbranchdf4, cbranchxf4 or
cbranchtf4 works out, and the generic code emits a libcall
(__{eq,ne}bf2). I bet that is the reason why libgcc contains the
__{eq,ne}hf2 entrypoints. I wanted to avoid adding __{eq,ne}bf2, and
adding cbranchbf4/cstorebf4 was how I managed to do that: it tells the
generic code that it can handle those by the faster BFmode to SFmode
conversions of the operands and then perform one or two bit checks.

I guess another possibility would be to call ix86_expand_branch there once
or twice and repeat what the generic code does, or to add the libgcc
entrypoints, which would perhaps bypass soft-fp and just do the
shifts + SFmode comparison.

> > > +      else
> > > +        {
> > > +          rtx t2 = gen_reg_rtx (SImode);
> > > +          emit_insn (gen_zero_extendhisi2 (t2, op2));
> > > +          emit_insn (gen_ashlsi3 (t2, t2, GEN_INT (16)));
> > > +          op2 = gen_lowpart (SFmode, t2);
> > > +        }
>
> Similar to cbranch above, use ix86_expand_setcc and copy predicates
> from cstoresf4.

Ditto here, cstore is actually required by the generic code once cbranch
is implemented.

        Jakub
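
For illustration only, here is a minimal C sketch of the "shifts + SFmode
comparison" idea mentioned above: the 16 bfloat16 bits are zero-extended,
shifted left by 16 and reinterpreted as a float, and the comparison is then
done on floats. The helper names bf16_bits_to_float and bf16_bits_eq are
hypothetical (they are not from the patch or from libgcc), and the sketch
assumes float is IEEE binary32 and the bfloat16 value is passed as raw
uint16_t bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: widen raw bfloat16 bits
   to float by placing them in the most significant 16 bits of a 32-bit
   word.  Assumes float is IEEE binary32.  */
static float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;  /* zero extend, then shift */
  float f;
  memcpy (&f, &wide, sizeof f);           /* reinterpret the bits as float */
  return f;
}

/* Hypothetical helper: equality done as a float comparison, so NaNs
   compare unequal and +0.0 equals -0.0.  */
static int
bf16_bits_eq (uint16_t a, uint16_t b)
{
  return bf16_bits_to_float (a) == bf16_bits_to_float (b);
}
```

The widening is exact because every bfloat16 value is representable as a
float with the low 16 mantissa bits zero, which is why no library call is
needed for it even when NaNs are honored.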