From: Xi Ruoyao
To: gcc-patches@gcc.gnu.org
Cc: chenglulu, i@xen0n.name, xuchenghua@loongson.cn, Xi Ruoyao
Subject: [PATCH v3 0/5] LoongArch: SIMD fixes and optimizations
Date: Mon, 20 Nov 2023 08:47:23 +0800
Message-ID: <20231120004728.205167-1-xry111@xry111.site>

The [1/5] patch is the PR112578 fix at
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637097.html.
It has been changed to remove the nearbyint pattern (because nearbyint
should not raise FE_INEXACT even with -ffp-int-builtin-inexact).
Because the other patches depend on the simd.md file introduced by this
patch, it is sent as the first patch of the series.

Many LASX instructions differ from the corresponding LSX instructions
only in operand length, so this patch creates a simd.md file to hold
the RTX templates that LSX and LASX can share. This makes the code
cleaner and easier to maintain.

The [2/5] and [3/5] patches make the vector high-part multiply and
rotate shift operations available to GNU vectors and
auto-vectorization.

The [4/5] patch is a simple code cleanup, with no functional change.

The [5/5] patch uses LSX for scalar FP rounding operations if LSX is
available and -ffp-int-builtin-inexact is in effect. We do this because
the base FP ISA does not have such instructions.
Using LSX is overkill, but still much faster than calling libc
functions.

Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?

Xi Ruoyao (5):
  LoongArch: Fix usage of LSX and LASX frint/ftint instructions
    [PR112578]
  LoongArch: Use standard pattern name and RTX code for LSX/LASX muh
    instructions
  LoongArch: Use standard pattern name and RTX code for LSX/LASX rotate
    shift
  LoongArch: Remove lrint_allow_inexact
  LoongArch: Use LSX for scalar FP rounding with explicit rounding mode

 gcc/config/loongarch/lasx.md                  | 283 -----------------
 gcc/config/loongarch/loongarch-builtins.cc    |  52 ++--
 gcc/config/loongarch/loongarch.md             |  12 +-
 gcc/config/loongarch/lsx.md                   | 293 ------------------
 gcc/config/loongarch/simd.md                  | 268 ++++++++++++++++
 .../loongarch/vect-frint-no-inexact.c         |  48 +++
 .../loongarch/vect-frint-scalar-no-inexact.c  |  23 ++
 .../gcc.target/loongarch/vect-frint-scalar.c  |  43 +++
 .../gcc.target/loongarch/vect-frint.c         |  85 +++++
 .../loongarch/vect-ftint-no-inexact.c         |  44 +++
 .../gcc.target/loongarch/vect-ftint.c         |  83 +++++
 gcc/testsuite/gcc.target/loongarch/vect-muh.c |  36 +++
 .../gcc.target/loongarch/vect-rotr.c          |  36 +++
 13 files changed, 701 insertions(+), 605 deletions(-)
 create mode 100644 gcc/config/loongarch/simd.md
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint-no-inexact.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint-scalar-no-inexact.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint-scalar.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-frint.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-ftint-no-inexact.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-ftint.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-muh.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-rotr.c

-- 
2.42.1
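
For readers unfamiliar with what [2/5] and [3/5] are aiming at, here is
a rough sketch of the kind of source-level loops that a high-part
multiply pattern and a rotate pattern are meant to serve. This is not
taken from the vect-muh.c or vect-rotr.c tests in the series; the
function names mul_high_i32 and rotr_u32 are made up for illustration,
and whether a given GCC build actually vectorizes these loops with LSX
or LASX depends on the options used (something like -O2 -mlsx is an
assumption here, not a tested claim).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store the high 32 bits of each signed 32x32-bit
   product.  Relies on GCC's arithmetic right shift of negative values.  */
void mul_high_i32(int32_t *restrict out, const int32_t *restrict a,
                  const int32_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int32_t)(((int64_t)a[i] * b[i]) >> 32);
}

/* Hypothetical helper: rotate each 32-bit element right by s bits.
   Both shift counts are masked so that s == 0 is well defined.  */
void rotr_u32(uint32_t *restrict out, const uint32_t *restrict a,
              unsigned s, size_t n)
{
    s &= 31;
    for (size_t i = 0; i < n; i++)
        out[i] = (a[i] >> s) | (a[i] << ((32 - s) & 31));
}
```

Both loops are plain portable C, so they can be compiled on any target
to compare the generated code with and without the SIMD extensions
enabled.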