From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20230810004728.15915-1-hongtao.liu@intel.com>
From: Hongtao Liu
Date: Thu, 10 Aug 2023 14:32:18 +0800
Subject: Re: [PATCH] i386: Do not sanitize upper part of V2HFmode and V4HFmode reg with -fno-trapping-math [PR110832]
To: Uros Bizjak
Cc: liuhongt, gcc-patches@gcc.gnu.org

On Thu, Aug 10, 2023 at 2:06 PM Hongtao Liu wrote:
>
> On Thu, Aug 10, 2023 at 2:01 PM Uros Bizjak via Gcc-patches
> wrote:
> >
> > On Thu, Aug 10, 2023 at 2:49 AM liuhongt wrote:
> > >
> > > Also add ix86_partial_vec_fp_math to the condition of V2HF/V4HF named
> > > patterns in order to avoid generation of partial vector V8HFmode
> > > trapping instructions.
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> > > Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/110832
> > >         * config/i386/mmx.md: (movq__to_sse): Also do not
> > >         sanitize upper part of V4HFmode register with
> > >         -fno-trapping-math.
> > >         (v4hf3): Enable for ix86_partial_vec_fp_math.
> > >         (
> > >         (v2hf3): Ditto.
> > >         (divv2hf3): Ditto.
> > >         (movd_v2hf_to_sse): Do not sanitize upper part of V2HFmode
> > >         register with -fno-trapping-math.
> >
> > OK.
> >
> > BTW: I would just like to mention that plenty of instructions can be
> > enabled for V4HF/V2HFmode besides arithmetic insns. At least
> > conversions, comparisons, FMA and min/max (to name some of them) can
> > be enabled by introducing expanders that expand to V8HFmode
> > instruction.
> Yes, try to support that in GCC14.
I would wait for avx10's patch to go in first, so as to avoid extra
rebases and conflicts.
> >
> > Uros.
> > >
> > > ---
> > >  gcc/config/i386/mmx.md | 20 ++++++++++++++------
> > >  1 file changed, 14 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
> > > index d51b3b9dc71..170432a7128 100644
> > > --- a/gcc/config/i386/mmx.md
> > > +++ b/gcc/config/i386/mmx.md
> > > @@ -596,7 +596,7 @@ (define_expand "movq__to_sse"
> > >            (match_dup 2)))]
> > >    "TARGET_SSE2"
> > >  {
> > > -  if (mode == V2SFmode
> > > +  if (mode != V2SImode
> > >        && !flag_trapping_math)
> > >      {
> > >        rtx op1 = force_reg (mode, operands[1]);
> > > @@ -1941,7 +1941,7 @@ (define_expand "v4hf3"
> > >          (plusminusmult:V4HF
> > >            (match_operand:V4HF 1 "nonimmediate_operand")
> > >            (match_operand:V4HF 2 "nonimmediate_operand")))]
> > > -  "TARGET_AVX512FP16 && TARGET_AVX512VL"
> > > +  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
> > >  {
> > >    rtx op2 = gen_reg_rtx (V8HFmode);
> > >    rtx op1 = gen_reg_rtx (V8HFmode);
> > > @@ -1961,7 +1961,7 @@ (define_expand "divv4hf3"
> > >          (div:V4HF
> > >            (match_operand:V4HF 1 "nonimmediate_operand")
> > >            (match_operand:V4HF 2 "nonimmediate_operand")))]
> > > -  "TARGET_AVX512FP16 && TARGET_AVX512VL"
> > > +  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
> > >  {
> > >    rtx op2 = gen_reg_rtx (V8HFmode);
> > >    rtx op1 = gen_reg_rtx (V8HFmode);
> > > @@ -1983,14 +1983,22 @@ (define_expand "movd_v2hf_to_sse"
> > >              (match_operand:V2HF 1 "nonimmediate_operand"))
> > >            (match_operand:V8HF 2 "reg_or_0_operand")
> > >            (const_int 3)))]
> > > -  "TARGET_SSE")
> > > +  "TARGET_SSE"
> > > +{
> > > +  if (!flag_trapping_math && operands[2] == CONST0_RTX (V8HFmode))
> > > +  {
> > > +    rtx op1 = force_reg (V2HFmode, operands[1]);
> > > +    emit_move_insn (operands[0], lowpart_subreg (V8HFmode, op1, V2HFmode));
> > > +    DONE;
> > > +  }
> > > +})
> > >
> > >  (define_expand "v2hf3"
> > >    [(set (match_operand:V2HF 0 "register_operand")
> > >          (plusminusmult:V2HF
> > >            (match_operand:V2HF 1 "nonimmediate_operand")
> > >            (match_operand:V2HF 2 "nonimmediate_operand")))]
> > > -  "TARGET_AVX512FP16 && TARGET_AVX512VL"
> > > +  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
> > >  {
> > >    rtx op2 = gen_reg_rtx (V8HFmode);
> > >    rtx op1 = gen_reg_rtx (V8HFmode);
> > > @@ -2009,7 +2017,7 @@ (define_expand "divv2hf3"
> > >          (div:V2HF
> > >            (match_operand:V2HF 1 "nonimmediate_operand")
> > >            (match_operand:V2HF 2 "nonimmediate_operand")))]
> > > -  "TARGET_AVX512FP16 && TARGET_AVX512VL"
> > > +  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
> > >  {
> > >    rtx op2 = gen_reg_rtx (V8HFmode);
> > >    rtx op1 = gen_reg_rtx (V8HFmode);
> > > --
> > > 2.31.1
> > >
> >
>
> --
> BR,
> Hongtao

-- 
BR,
Hongtao