From: Xi Ruoyao
To: gcc-patches@gcc.gnu.org
Cc: chenglulu, i@xen0n.name, xuchenghua@loongson.cn, Xi Ruoyao
Subject: [PATCH 0/5] LoongArch: Initial LA664 support
Date: Thu, 16 Nov 2023 21:18:32 +0800
Message-ID: <20231116131836.504699-2-xry111@xry111.site>

The Loongson 3A6000 processor will be shipped to general users this
month, and it features 4 cores with the new LA664 microarchitecture.
Here are some changes from LA464:

1. The 32-bit division instruction now ignores the high 32 bits of the
   input registers.  This is enumerated via CPUCFG word 0x2, bit 26.
2. The microarchitecture now guarantees that two loads on the same
   memory address won't be reordered with each other.  dbar 0x700 is
   turned into a nop.
3. The architecture now supports approximate reciprocal and reciprocal
   square root instructions (FRECIPE and FRSQRTE) on 32-bit or 64-bit
   floating-point values and the vectors of these values.
4. The architecture now supports the SC.Q instruction for 128-bit CAS.
5. The architecture now supports the LL.ACQ and SC.REL instructions
   (well, I don't really know what they are for).
6. The architecture now supports CAS instructions for 64, 32, 16, or
   8-bit values.
7. The architecture now supports atomic add and atomic swap
   instructions for 16 or 8-bit values.
8. Some non-zero hint values of DBAR instructions are added.

These features are documented in LoongArch v1.1.  Implementations can
implement any subset of them and enumerate the implemented features via
CPUCFG.  LA664 implements them all.

(8) is already implemented in previous patches because it's completely
backward-compatible.  This series implements (1) and (2) with the
switches -mdiv32 and -mld-seq-sa (these names are derived from the names
of the corresponding CPUCFG bits documented in the LoongArch v1.1
specification).

The other features require Binutils support, and we are close to the end
of GCC 14 stage 1, so I'm posting this series first.

With -march=la664, these two options are implicitly enabled, but they
can be turned off with -mno-div32 or -mno-ld-seq-sa.

With -march=native, the current CPU is probed via CPUCFG and these
options are implicitly enabled if the CPU supports the corresponding
feature.  They can be turned off with an explicit -mno-div32 or
-mno-ld-seq-sa as well.

-mtune=la664 is implemented as a copy of -mtune=la464, and we can adjust
it with benchmark results later.

Bootstrapped and regtested on an LA664 with
BOOT_CFLAGS="-march=la664 -O2" and on an LA464 with
BOOT_CFLAGS="-march=native -O2".  Also manually verified -march=native
probing on LA664 and LA464.
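To make (1) and the option defaulting described above a bit more
concrete, here is a rough host-side sketch in C.  It is not code from
this series: the names cpucfg2_has_div32, div_w_low32, resolve_option
and enum opt_state are all made up for illustration, and the only
hardware fact it relies on is the one stated above (CPUCFG word 0x2,
bit 26).  The division helper merely models "the high 32 bits of the
input registers are ignored"; it does not claim to describe the exact
result encoding of the real instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; the bit position is the one given in item (1).  */
#define CPUCFG2_DIV32_BIT 26

/* Return 1 if a raw CPUCFG word 0x2 value advertises feature (1).
   On real hardware the word would be read with the CPUCFG instruction;
   here it is passed in so the sketch runs on any host.  */
int cpucfg2_has_div32(uint32_t cpucfg2_word)
{
  return (int) ((cpucfg2_word >> CPUCFG2_DIV32_BIT) & 1u);
}

/* Reinterpret the low 32 bits of a 64-bit register image as signed.  */
static int32_t low32_as_signed(uint64_t reg)
{
  uint32_t low = (uint32_t) reg;
  int32_t value;
  memcpy(&value, &low, sizeof value);
  return value;
}

/* Model of a signed 32-bit division that ignores the high 32 bits of
   both inputs, so the inputs need not be sign-extended first.  */
int32_t div_w_low32(uint64_t rj, uint64_t rk)
{
  int32_t a = low32_as_signed(rj);
  int32_t b = low32_as_signed(rk);
  assert(b != 0);                       /* division by zero is out of scope */
  assert(!(a == INT32_MIN && b == -1)); /* so is signed overflow */
  return a / b;
}

/* Hypothetical tri-state for an option such as -mdiv32 / -mno-div32.  */
enum opt_state { OPT_UNSET = -1, OPT_OFF = 0, OPT_ON = 1 };

/* An explicit -m<opt> or -mno-<opt> wins; otherwise fall back to what
   -march implies (la664: enabled; native: the probed CPUCFG bit).  */
int resolve_option(enum opt_state explicit_opt, int arch_implies)
{
  if (explicit_opt != OPT_UNSET)
    return explicit_opt == OPT_ON;
  return arch_implies != 0;
}
```

For instance, resolve_option(OPT_OFF, 1) corresponds to
"-march=la664 -mno-div32", and
resolve_option(OPT_UNSET, cpucfg2_has_div32(word)) corresponds to
"-march=native" with no explicit switch.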
Xi Ruoyao (5):
  LoongArch: Switch loongarch-def to C++
  LoongArch: genopts: Add infrastructure to generate code for new
    features in ISA evolution
  LoongArch: Take the advantage of -mdiv32 if it's enabled
  LoongArch: Don't emit dbar 0x700 if -mld-seq-sa
  LoongArch: Add -march=la664 and -mtune=la664

 gcc/config/loongarch/genopts/genstr.sh        |  78 ++++++-
 gcc/config/loongarch/genopts/isa-evolution.in |   2 +
 .../loongarch/genopts/loongarch-strings       |   1 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  10 +
 gcc/config/loongarch/loongarch-cpu.cc         |  37 ++--
 gcc/config/loongarch/loongarch-cpucfg-map.h   |  36 +++
 gcc/config/loongarch/loongarch-def-array.h    |  40 ++++
 gcc/config/loongarch/loongarch-def.c          | 205 ------------------
 gcc/config/loongarch/loongarch-def.cc         | 193 +++++++++++++++++
 gcc/config/loongarch/loongarch-def.h          |  67 ++++--
 gcc/config/loongarch/loongarch-opts.h         |   9 +-
 gcc/config/loongarch/loongarch-str.h          |   8 +-
 gcc/config/loongarch/loongarch-tune.h         | 123 ++++++++++-
 gcc/config/loongarch/loongarch.cc             |   6 +-
 gcc/config/loongarch/loongarch.md             |  31 ++-
 gcc/config/loongarch/loongarch.opt            |  23 +-
 gcc/config/loongarch/t-loongarch              |  25 ++-
 .../gcc.target/loongarch/div-div32.c          |  31 +++
 .../gcc.target/loongarch/div-no-div32.c       |  11 +
 19 files changed, 664 insertions(+), 272 deletions(-)
 create mode 100644 gcc/config/loongarch/genopts/isa-evolution.in
 create mode 100644 gcc/config/loongarch/loongarch-cpucfg-map.h
 create mode 100644 gcc/config/loongarch/loongarch-def-array.h
 delete mode 100644 gcc/config/loongarch/loongarch-def.c
 create mode 100644 gcc/config/loongarch/loongarch-def.cc
 create mode 100644 gcc/testsuite/gcc.target/loongarch/div-div32.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/div-no-div32.c

-- 
2.42.1