public inbox for gcc-patches@gcc.gnu.org
* [PATCH v8 00/12] Add LoongArch support.
@ 2022-03-04  7:17 xuchenghua
  2022-03-04  7:17 ` [PATCH v8 01/12] LoongArch Port: Regenerate configure xuchenghua
                   ` (13 more replies)
  0 siblings, 14 replies; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:17 UTC
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

Hi, all:

This is v8 of the LoongArch port. Please review.

We know it is stage 4, but we think it is OK to accept a new port at
this point. The kernel side is waiting upstream for approval from the
GCC side; if this is blocked by stage 4, an approval targeting GCC 13
would be appreciated.

The LoongArch architecture (LoongArch) is a Reduced Instruction Set
Computer (RISC) style Instruction Set Architecture (ISA).
The documentation is available at:
https://loongson.github.io/LoongArch-Documentation/README-EN.html

The ELF ABI documentation is available at:
https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

The binutils port has been merged into the binutils-gdb trunk:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=560b3fe208255ae909b4b1c88ba9c28b09043307

Note: We split -mabi= into -mabi=lp64d/f/s. The new options are not
supported by upstream binutils yet, so building this GCC port requires
the following patch applied to binutils:
https://github.com/loongson/binutils-gdb/commit/aacb0bf860f02aa5a7dcb76dd0e392bf871c7586
(it will be submitted upstream after the GCC side is confirmed)
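
For reference, a sketch of how the split ABI options are used once the
patched binutils is in place (the target triplet shown is illustrative,
not mandated by this series):

    # Configure GCC with one of the new base ABIs as the default.
    ../gcc/configure --target=loongarch64-unknown-linux-gnu \
        --with-abi=lp64d --enable-languages=c,c++

    # Select a base ABI per invocation with the split -mabi= option.
    loongarch64-unknown-linux-gnu-gcc -mabi=lp64s -c hello.c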

We have compiled more than 300 CLFS packages with this compiler.
The CLFS system is currently used on the Cfarm machines gcc400 and gcc401.

changelog:

v1 -> v2
1. Split patch set.
2. Change some code style.
3. Add -mabi=lp64d/f/s options.
4. Change GLIBC_DYNAMIC_LINKER_LP64 name.

v2 -> v3
1. Change some code style.
2. Bug fix.

v3 -> v4
1. Change some code style.
2. Bug fix.
3. Delete some builtin macros.

v4 -> v5
1. Delete the incorrect insn zero_extendsidi2_internal.
2. Adjust some build options.
3. Change some .c files to .cc.

v5 -> v6
1. Fix build issues: the generated *.opt and *.h files are now
   written to $(objdir).

v6 -> v7
1. Bug fix.
2. Change some code style.

v7 -> v8
1. Add new addressing type ADDRESS_REG_REG support.
2. Modify documentation.
3. Eliminate compile-time warnings.

chenglulu (12):
  LoongArch Port: Regenerate configure
  LoongArch Port: gcc build
  LoongArch Port: Regenerate gcc/configure.
  LoongArch Port: Machine description files.
  LoongArch Port: Machine description C files and .h files.
  LoongArch Port: Builtin functions.
  LoongArch Port: Builtin macros.
  LoongArch Port: libgcc
  LoongArch Port: Regenerate libgcc/configure.
  LoongArch Port: libgomp
  LoongArch Port: gcc/testsuite
  LoongArch Port: Add doc.

 config/picflag.m4                             |    3 +
 configure                                     |   10 +-
 configure.ac                                  |   10 +-
 contrib/config-list.mk                        |    5 +-
 contrib/gcc_update                            |    2 +
 .../config/loongarch/loongarch-common.cc      |   73 +
 gcc/config.gcc                                |  410 +-
 gcc/config/host-linux.cc                      |    2 +
 gcc/config/loongarch/constraints.md           |  204 +
 gcc/config/loongarch/generic.md               |  132 +
 gcc/config/loongarch/genopts/genstr.sh        |   91 +
 .../loongarch/genopts/loongarch-strings       |   58 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  189 +
 gcc/config/loongarch/gnu-user.h               |   84 +
 gcc/config/loongarch/la464.md                 |  132 +
 gcc/config/loongarch/larchintrin.h            |  413 ++
 gcc/config/loongarch/linux.h                  |   50 +
 gcc/config/loongarch/loongarch-builtins.cc    |  511 ++
 gcc/config/loongarch/loongarch-c.cc           |  109 +
 gcc/config/loongarch/loongarch-cpu.cc         |  206 +
 gcc/config/loongarch/loongarch-cpu.h          |   30 +
 gcc/config/loongarch/loongarch-def.c          |  164 +
 gcc/config/loongarch/loongarch-def.h          |  151 +
 gcc/config/loongarch/loongarch-driver.cc      |  187 +
 gcc/config/loongarch/loongarch-driver.h       |   69 +
 gcc/config/loongarch/loongarch-ftypes.def     |  106 +
 gcc/config/loongarch/loongarch-modes.def      |   29 +
 gcc/config/loongarch/loongarch-opts.cc        |  580 ++
 gcc/config/loongarch/loongarch-opts.h         |   86 +
 gcc/config/loongarch/loongarch-protos.h       |  240 +
 gcc/config/loongarch/loongarch-str.h          |   57 +
 gcc/config/loongarch/loongarch-tune.h         |   72 +
 gcc/config/loongarch/loongarch.cc             | 6450 +++++++++++++++++
 gcc/config/loongarch/loongarch.h              | 1273 ++++
 gcc/config/loongarch/loongarch.md             | 3712 ++++++++++
 gcc/config/loongarch/loongarch.opt            |  189 +
 gcc/config/loongarch/predicates.md            |  527 ++
 gcc/config/loongarch/sync.md                  |  574 ++
 gcc/config/loongarch/t-linux                  |   53 +
 gcc/config/loongarch/t-loongarch              |   68 +
 gcc/configure                                 |   66 +-
 gcc/configure.ac                              |   33 +-
 gcc/doc/install.texi                          |   47 +-
 gcc/doc/invoke.texi                           |  202 +
 gcc/doc/md.texi                               |   55 +
 gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C    |    2 +-
 gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C   |    2 +-
 gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C   |    2 +-
 gcc/testsuite/gcc.dg/20020312-2.c             |    2 +
 gcc/testsuite/gcc.dg/loop-8.c                 |    2 +-
 .../torture/stackalign/builtin-apply-2.c      |    2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c     |    2 +-
 .../gcc.target/loongarch/loongarch.exp        |   40 +
 .../gcc.target/loongarch/tst-asm-const.c      |   16 +
 gcc/testsuite/go.test/go-test.exp             |    3 +
 gcc/testsuite/lib/target-supports.exp         |   14 +
 libgcc/config.host                            |   28 +-
 libgcc/config/loongarch/crtfastmath.c         |   52 +
 libgcc/config/loongarch/crti.S                |   43 +
 libgcc/config/loongarch/crtn.S                |   39 +
 libgcc/config/loongarch/linux-unwind.h        |   80 +
 libgcc/config/loongarch/sfp-machine.h         |  152 +
 libgcc/config/loongarch/t-crtstuff            |    5 +
 libgcc/config/loongarch/t-loongarch           |    7 +
 libgcc/config/loongarch/t-loongarch64         |    1 +
 libgcc/config/loongarch/t-softfp-tf           |    3 +
 libgcc/configure                              |    5 +-
 libgcc/configure.ac                           |    2 +-
 libgomp/configure.tgt                         |    4 +
 69 files changed, 18194 insertions(+), 28 deletions(-)
 create mode 100644 gcc/common/config/loongarch/loongarch-common.cc
 create mode 100644 gcc/config/loongarch/constraints.md
 create mode 100644 gcc/config/loongarch/generic.md
 create mode 100755 gcc/config/loongarch/genopts/genstr.sh
 create mode 100644 gcc/config/loongarch/genopts/loongarch-strings
 create mode 100644 gcc/config/loongarch/genopts/loongarch.opt.in
 create mode 100644 gcc/config/loongarch/gnu-user.h
 create mode 100644 gcc/config/loongarch/la464.md
 create mode 100644 gcc/config/loongarch/larchintrin.h
 create mode 100644 gcc/config/loongarch/linux.h
 create mode 100644 gcc/config/loongarch/loongarch-builtins.cc
 create mode 100644 gcc/config/loongarch/loongarch-c.cc
 create mode 100644 gcc/config/loongarch/loongarch-cpu.cc
 create mode 100644 gcc/config/loongarch/loongarch-cpu.h
 create mode 100644 gcc/config/loongarch/loongarch-def.c
 create mode 100644 gcc/config/loongarch/loongarch-def.h
 create mode 100644 gcc/config/loongarch/loongarch-driver.cc
 create mode 100644 gcc/config/loongarch/loongarch-driver.h
 create mode 100644 gcc/config/loongarch/loongarch-ftypes.def
 create mode 100644 gcc/config/loongarch/loongarch-modes.def
 create mode 100644 gcc/config/loongarch/loongarch-opts.cc
 create mode 100644 gcc/config/loongarch/loongarch-opts.h
 create mode 100644 gcc/config/loongarch/loongarch-protos.h
 create mode 100644 gcc/config/loongarch/loongarch-str.h
 create mode 100644 gcc/config/loongarch/loongarch-tune.h
 create mode 100644 gcc/config/loongarch/loongarch.cc
 create mode 100644 gcc/config/loongarch/loongarch.h
 create mode 100644 gcc/config/loongarch/loongarch.md
 create mode 100644 gcc/config/loongarch/loongarch.opt
 create mode 100644 gcc/config/loongarch/predicates.md
 create mode 100644 gcc/config/loongarch/sync.md
 create mode 100644 gcc/config/loongarch/t-linux
 create mode 100644 gcc/config/loongarch/t-loongarch
 create mode 100644 gcc/testsuite/gcc.target/loongarch/loongarch.exp
 create mode 100644 gcc/testsuite/gcc.target/loongarch/tst-asm-const.c
 create mode 100644 libgcc/config/loongarch/crtfastmath.c
 create mode 100644 libgcc/config/loongarch/crti.S
 create mode 100644 libgcc/config/loongarch/crtn.S
 create mode 100644 libgcc/config/loongarch/linux-unwind.h
 create mode 100644 libgcc/config/loongarch/sfp-machine.h
 create mode 100644 libgcc/config/loongarch/t-crtstuff
 create mode 100644 libgcc/config/loongarch/t-loongarch
 create mode 100644 libgcc/config/loongarch/t-loongarch64
 create mode 100644 libgcc/config/loongarch/t-softfp-tf

-- 
2.31.1



* [PATCH v8 01/12] LoongArch Port: Regenerate configure
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
@ 2022-03-04  7:17 ` xuchenghua
  2022-03-04  7:17 ` [PATCH v8 02/12] LoongArch Port: gcc build xuchenghua
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:17 UTC
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

        * config/picflag.m4: Default to '-fpic' for LoongArch.
        * configure: Add LoongArch tuples.
        * configure.ac: Likewise.
---
 config/picflag.m4 |  3 +++
 configure         | 10 +++++++++-
 configure.ac      | 10 +++++++++-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/config/picflag.m4 b/config/picflag.m4
index 8b106f9af88..0aefcf619bf 100644
--- a/config/picflag.m4
+++ b/config/picflag.m4
@@ -44,6 +44,9 @@ case "${$2}" in
 	# sets the default TLS model and affects inlining.
 	$1=-fPIC
 	;;
+    loongarch*-*-*)
+	$1=-fpic
+	;;
     mips-sgi-irix6*)
 	# PIC is the default.
 	;;
diff --git a/configure b/configure
index 9c2d7df1bb2..87548f0da96 100755
--- a/configure
+++ b/configure
@@ -3060,7 +3060,7 @@ case "${ENABLE_GOLD}" in
       # Check for target supported by gold.
       case "${target}" in
         i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-        | aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
+        | aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
 	  configdirs="$configdirs gold"
 	  if test x${ENABLE_GOLD} = xdefault; then
 	    default_ld=gold
@@ -3646,6 +3646,9 @@ case "${target}" in
   i[3456789]86-*-*)
     libgloss_dir=i386
     ;;
+  loongarch*-*-*)
+    libgloss_dir=loongarch
+    ;;
   m68hc11-*-*|m6811-*-*|m68hc12-*-*|m6812-*-*)
     libgloss_dir=m68hc11
     ;;
@@ -4030,6 +4033,11 @@ case "${target}" in
   wasm32-*-*)
     noconfigdirs="$noconfigdirs ld"
     ;;
+  loongarch*-*-linux*)
+    ;;
+  loongarch*-*-*)
+    noconfigdirs="$noconfigdirs gprof"
+    ;;
 esac
 
 # If we aren't building newlib, then don't build libgloss, since libgloss
diff --git a/configure.ac b/configure.ac
index 68cc5cc31fe..55362afeeae 100644
--- a/configure.ac
+++ b/configure.ac
@@ -353,7 +353,7 @@ case "${ENABLE_GOLD}" in
       # Check for target supported by gold.
       case "${target}" in
         i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-        | aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
+        | aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
 	  configdirs="$configdirs gold"
 	  if test x${ENABLE_GOLD} = xdefault; then
 	    default_ld=gold
@@ -899,6 +899,9 @@ case "${target}" in
   i[[3456789]]86-*-*)
     libgloss_dir=i386
     ;;
+  loongarch*-*-*)
+    libgloss_dir=loongarch
+    ;;
   m68hc11-*-*|m6811-*-*|m68hc12-*-*|m6812-*-*)
     libgloss_dir=m68hc11
     ;;
@@ -1283,6 +1286,11 @@ case "${target}" in
   wasm32-*-*)
     noconfigdirs="$noconfigdirs ld"
     ;;
+  loongarch*-*-linux*)
+    ;;
+  loongarch*-*-*)
+    noconfigdirs="$noconfigdirs gprof"
+    ;;
 esac
 
 # If we aren't building newlib, then don't build libgloss, since libgloss
-- 
2.31.1



* [PATCH v8 02/12] LoongArch Port: gcc build
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
  2022-03-04  7:17 ` [PATCH v8 01/12] LoongArch Port: Regenerate configure xuchenghua
@ 2022-03-04  7:17 ` xuchenghua
  2022-03-06 10:29   ` Richard Sandiford
  2022-03-04  7:18 ` [PATCH v8 03/12] LoongArch Port: Regenerate gcc/configure xuchenghua
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:17 UTC
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/

        * common/config/loongarch/loongarch-common.cc: New file.
        * config/loongarch/genopts/genstr.sh: New file.
        * config/loongarch/genopts/loongarch-strings: New file.
        * config/loongarch/genopts/loongarch.opt.in: New file.
	* config/loongarch/loongarch-str.h: New file.
        * config/loongarch/gnu-user.h: New file.
        * config/loongarch/linux.h: New file.
        * config/loongarch/loongarch-cpu.cc: New file.
        * config/loongarch/loongarch-cpu.h: New file.
	* config/loongarch/loongarch-def.c: New file.
	* config/loongarch/loongarch-def.h: New file.
        * config/loongarch/loongarch-driver.cc: New file.
        * config/loongarch/loongarch-driver.h: New file.
        * config/loongarch/loongarch-opts.cc: New file.
        * config/loongarch/loongarch-opts.h: New file.
        * config/loongarch/loongarch.opt: New file.
        * config/loongarch/t-linux: New file.
        * config/loongarch/t-loongarch: New file.
	* gcc_update (files_and_dependencies): Add
	config/loongarch/loongarch.opt and config/loongarch/loongarch-str.h.
        * config.gcc: Add LoongArch support.
        * configure.ac: Add LoongArch support.
---
 contrib/gcc_update                            |   2 +
 .../config/loongarch/loongarch-common.cc      |  73 +++
 gcc/config.gcc                                | 410 ++++++++++++-
 gcc/config/loongarch/genopts/genstr.sh        |  91 +++
 .../loongarch/genopts/loongarch-strings       |  58 ++
 gcc/config/loongarch/genopts/loongarch.opt.in | 189 ++++++
 gcc/config/loongarch/gnu-user.h               |  84 +++
 gcc/config/loongarch/linux.h                  |  50 ++
 gcc/config/loongarch/loongarch-cpu.cc         | 206 +++++++
 gcc/config/loongarch/loongarch-cpu.h          |  30 +
 gcc/config/loongarch/loongarch-def.c          | 164 +++++
 gcc/config/loongarch/loongarch-def.h          | 151 +++++
 gcc/config/loongarch/loongarch-driver.cc      | 187 ++++++
 gcc/config/loongarch/loongarch-driver.h       |  69 +++
 gcc/config/loongarch/loongarch-opts.cc        | 580 ++++++++++++++++++
 gcc/config/loongarch/loongarch-opts.h         |  86 +++
 gcc/config/loongarch/loongarch-str.h          |  57 ++
 gcc/config/loongarch/loongarch.opt            | 189 ++++++
 gcc/config/loongarch/t-linux                  |  53 ++
 gcc/config/loongarch/t-loongarch              |  68 ++
 gcc/configure.ac                              |  33 +-
 21 files changed, 2825 insertions(+), 5 deletions(-)
 create mode 100644 gcc/common/config/loongarch/loongarch-common.cc
 create mode 100755 gcc/config/loongarch/genopts/genstr.sh
 create mode 100644 gcc/config/loongarch/genopts/loongarch-strings
 create mode 100644 gcc/config/loongarch/genopts/loongarch.opt.in
 create mode 100644 gcc/config/loongarch/gnu-user.h
 create mode 100644 gcc/config/loongarch/linux.h
 create mode 100644 gcc/config/loongarch/loongarch-cpu.cc
 create mode 100644 gcc/config/loongarch/loongarch-cpu.h
 create mode 100644 gcc/config/loongarch/loongarch-def.c
 create mode 100644 gcc/config/loongarch/loongarch-def.h
 create mode 100644 gcc/config/loongarch/loongarch-driver.cc
 create mode 100644 gcc/config/loongarch/loongarch-driver.h
 create mode 100644 gcc/config/loongarch/loongarch-opts.cc
 create mode 100644 gcc/config/loongarch/loongarch-opts.h
 create mode 100644 gcc/config/loongarch/loongarch-str.h
 create mode 100644 gcc/config/loongarch/loongarch.opt
 create mode 100644 gcc/config/loongarch/t-linux
 create mode 100644 gcc/config/loongarch/t-loongarch

diff --git a/contrib/gcc_update b/contrib/gcc_update
index 1cf15f9b3c2..641ce164775 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -86,6 +86,8 @@ gcc/config/arm/arm-tables.opt: gcc/config/arm/arm-cpus.in gcc/config/arm/parsecp
 gcc/config/c6x/c6x-tables.opt: gcc/config/c6x/c6x-isas.def gcc/config/c6x/genopt.sh
 gcc/config/c6x/c6x-sched.md: gcc/config/c6x/c6x-sched.md.in gcc/config/c6x/gensched.sh
 gcc/config/c6x/c6x-mult.md: gcc/config/c6x/c6x-mult.md.in gcc/config/c6x/genmult.sh
+gcc/config/loongarch/loongarch-str.h: gcc/config/loongarch/genopts/genstr.sh gcc/config/loongarch/genopts/loongarch-strings
+gcc/config/loongarch/loongarch.opt: gcc/config/loongarch/genopts/genstr.sh gcc/config/loongarch/genopts/loongarch.opt.in
 gcc/config/m68k/m68k-tables.opt: gcc/config/m68k/m68k-devices.def gcc/config/m68k/m68k-isas.def gcc/config/m68k/m68k-microarchs.def gcc/config/m68k/genopt.sh
 gcc/config/mips/mips-tables.opt: gcc/config/mips/mips-cpus.def gcc/config/mips/genopt.sh
 gcc/config/rs6000/rs6000-tables.opt: gcc/config/rs6000/rs6000-cpus.def gcc/config/rs6000/genopt.sh
diff --git a/gcc/common/config/loongarch/loongarch-common.cc b/gcc/common/config/loongarch/loongarch-common.cc
new file mode 100644
index 00000000000..5bdfd2a30e1
--- /dev/null
+++ b/gcc/common/config/loongarch/loongarch-common.cc
@@ -0,0 +1,73 @@
+/* Common hooks for LoongArch.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "common/common-target.h"
+#include "common/common-target-def.h"
+#include "opts.h"
+#include "flags.h"
+#include "diagnostic-core.h"
+
+#undef	TARGET_OPTION_OPTIMIZATION_TABLE
+#define TARGET_OPTION_OPTIMIZATION_TABLE loongarch_option_optimization_table
+
+/* Set default optimization options.  */
+static const struct default_options loongarch_option_optimization_table[] =
+{
+  { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
+  { OPT_LEVELS_NONE, 0, NULL, 0 }
+};
+
+/* Implement TARGET_HANDLE_OPTION.  */
+
+static bool
+loongarch_handle_option (struct gcc_options *opts,
+			 struct gcc_options *opts_set ATTRIBUTE_UNUSED,
+			 const struct cl_decoded_option *decoded,
+			 location_t loc ATTRIBUTE_UNUSED)
+{
+  size_t code = decoded->opt_index;
+  int value = decoded->value;
+
+  switch (code)
+    {
+    case OPT_mmemcpy:
+      if (value)
+	{
+	  if (opts->x_optimize_size)
+	    opts->x_target_flags |= MASK_MEMCPY;
+	}
+      else
+	opts->x_target_flags &= ~MASK_MEMCPY;
+      return true;
+
+    default:
+      return true;
+    }
+}
+
+#undef TARGET_DEFAULT_TARGET_FLAGS
+#define TARGET_DEFAULT_TARGET_FLAGS	MASK_CHECK_ZERO_DIV
+#undef TARGET_HANDLE_OPTION
+#define TARGET_HANDLE_OPTION loongarch_handle_option
+
+struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 3833bfa16a9..b42ceb6e4fc 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -454,6 +454,13 @@ mips*-*-*)
 	extra_objs="frame-header-opt.o"
 	extra_options="${extra_options} g.opt fused-madd.opt mips/mips-tables.opt"
 	;;
+loongarch*-*-*)
+	cpu_type=loongarch
+	extra_headers="larchintrin.h"
+	extra_objs="loongarch-c.o loongarch-builtins.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
+	extra_gcc_objs="loongarch-driver.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
+	extra_options="${extra_options} g.opt fused-madd.opt"
+	;;
 nds32*)
 	cpu_type=nds32
 	extra_headers="nds32_intrinsic.h nds32_isr.h nds32_init.inc"
@@ -2495,6 +2502,20 @@ riscv*-*-freebsd*)
 	# automatically detect that GAS supports it, yet we require it.
 	gcc_cv_initfini_array=yes
 	;;
+
+loongarch*-*-linux*)
+	tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file}"
+	tm_file="${tm_file} loongarch/gnu-user.h loongarch/linux.h"
+	extra_options="${extra_options} linux-android.opt"
+	tmake_file="${tmake_file} loongarch/t-linux"
+	gnu_ld=yes
+	gas=yes
+
+	# Force .init_array support.  The configure script cannot always
+	# automatically detect that GAS supports it, yet we require it.
+	gcc_cv_initfini_array=yes
+	;;
+
 mips*-*-netbsd*)			# NetBSD/mips, either endian.
 	target_cpu_default="MASK_ABICALLS"
 	tm_file="elfos.h ${tm_file} mips/elf.h ${nbsd_tm_file} mips/netbsd.h"
@@ -3601,7 +3622,7 @@ case ${target} in
         ;;
 *-*-linux* | *-*-gnu*)
 	case ${target} in
-	aarch64*-* | arm*-* | i[34567]86-* | powerpc*-* | s390*-* | sparc*-* | x86_64-*)
+	aarch64*-* | arm*-* | i[34567]86-* | powerpc*-* | s390*-* | sparc*-* | x86_64-* | loongarch*-*)
 		default_gnu_indirect_function=yes
 		;;
 	esac
@@ -4933,6 +4954,348 @@ case "${target}" in
 		esac
 		;;
 
+	loongarch*-*-*)
+		supported_defaults="abi arch tune fpu"
+
+		# Local variables
+		unset \
+			abi_pattern      abi_default    \
+			abiext_pattern   abiext_default \
+			arch_pattern     arch_default   \
+			fpu_pattern      fpu_default    \
+			tune_pattern     tune_default   \
+			triplet_os       triplet_abi
+
+		# Inferring ABI from the triplet.
+		case ${target} in
+		loongarch64-*-*-*f64)
+			abi_pattern="lp64d"
+			triplet_abi="f64"
+			;;
+		loongarch64-*-*-*f32)
+			abi_pattern="lp64f"
+			triplet_abi="f32"
+			;;
+		loongarch64-*-*-*sf)
+			abi_pattern="lp64s"
+			triplet_abi="sf"
+			;;
+		loongarch64-*-*-*)
+			abi_pattern="lp64[dfs]"
+			abi_default="lp64d"
+			triplet_abi=""
+			;;
+		*)
+			echo "Unsupported target ${target}." 1>&2
+			exit 1
+			;;
+		esac
+
+		abiext_pattern="*"
+		abiext_default="base"
+		triplet_abi+=""
+
+		# Getting the canonical triplet (multiarch specifier).
+		case ${target} in
+		  *-linux-gnu*)  triplet_os="linux-gnu";;
+		  *-linux-musl*) triplet_os="linux-musl";;
+		  *)
+			  echo "Unsupported target ${target}." 1>&2
+			  exit 1
+			  ;;
+		esac
+
+		la_canonical_triplet="loongarch64-${triplet_os}${triplet_abi}"
+
+		## Setting default value for with_abi.
+		case ${with_abi} in
+		"")
+			if test x${abi_default} != x; then
+				with_abi=${abi_default}
+			else
+				with_abi=${abi_pattern}
+			fi
+			;;
+
+		*)
+			if echo "${with_abi}" | grep -E "^${abi_pattern}$"; then
+				: # OK
+			else
+				echo "Incompatible options:" \
+				"--with-abi=${with_abi} and --target=${target}." 1>&2
+				exit 1
+			fi
+			;;
+		esac
+
+		## Setting default value for with_abiext (internal)
+		case ${with_abiext} in
+		"")
+			if test x${abiext_default} != x; then
+				with_abiext=${abiext_default}
+			else
+				with_abiext=${abiext_pattern}
+			fi
+			;;
+
+		*)
+			if echo "${with_abiext}" | grep -E "^${abiext_pattern}$"; then
+				: # OK
+			else
+				echo "The ABI extension type \"${with_abiext}\"" \
+				"is incompatible with --target=${target}." 1>&2
+				exit 1
+			fi
+
+			;;
+		esac
+
+
+		# Inferring ISA-related default options from the ABI: pass 1
+		case ${with_abi}/${with_abiext} in
+		lp64*/base)
+			# architectures that support lp64* ABI
+			arch_pattern="native|loongarch64|la464"
+			# default architecture for lp64* ABI
+			arch_default="loongarch64"
+			;;
+		*)
+			echo "Unsupported ABI type ${with_abi}/${with_abiext}." 1>&2
+			exit 1
+			;;
+		esac
+
+		# Inferring ISA-related default options from the ABI: pass 2
+		case ${with_abi}/${with_abiext} in
+		lp64d/base)
+			fpu_pattern="64"
+			;;
+		lp64f/base)
+			fpu_pattern="32|64"
+			fpu_default="32"
+			;;
+		lp64s/base)
+			fpu_pattern="none|32|64"
+			fpu_default="none"
+			;;
+		*)
+			echo "Unsupported ABI type ${with_abi}/${with_abiext}." 1>&2
+			exit 1
+			;;
+		esac
+
+		## Setting default value for with_arch.
+		case ${with_arch} in
+		"")
+			if test x${arch_default} != x; then
+				with_arch=${arch_default}
+			else
+				with_arch=${arch_pattern}
+			fi
+			;;
+
+		*)
+			if echo "${with_arch}" | grep -E "^${arch_pattern}$"; then
+				: # OK
+			else
+				echo "${with_abi}/${with_abiext} ABI cannot be implemented with" \
+				"--with-arch=${with_arch}." 1>&2
+				exit 1
+			fi
+			;;
+		esac
+
+		## Setting default value for with_fpu.
+		if test x${with_fpu} == xnone; then
+			with_fpu="0"
+		fi
+
+		case ${with_fpu} in
+		"")
+			if test x${fpu_default} != x; then
+				with_fpu=${fpu_default}
+			else
+				with_fpu=${fpu_pattern}
+			fi
+			;;
+
+		*)
+			if echo "${with_fpu}" | grep -E "^${fpu_pattern}$"; then
+				: # OK
+			else
+				echo "${with_abi}/${with_abiext} ABI cannot be implemented with" \
+				"--with-fpu=${with_fpu}." 1>&2
+				exit 1
+			fi
+			;;
+		esac
+
+
+		# Inferring default with_tune from with_arch: pass 1
+		case ${with_arch} in
+		native)
+			tune_pattern="*"
+			tune_default="native"
+			;;
+		loongarch64)
+			tune_pattern="loongarch64 | la464"
+			tune_default="la464"
+			;;
+		*)
+			# By default, $with_tune == $with_arch
+			tune_pattern="$with_arch"
+			;;
+		esac
+
+		## Setting default value for with_tune.
+		case ${with_tune} in
+		"")
+			if test x${tune_default} != x; then
+				with_tune=${tune_default}
+			else
+				with_tune=${tune_pattern}
+			fi
+			;;
+
+		*)
+			if echo "${with_tune}" | grep -E "^${tune_pattern}$"; then
+				: # OK
+			else
+				echo "Incompatible options: --with-tune=${with_tune}" \
+				"and --with-arch=${with_arch}." 1>&2
+				exit 1
+			fi
+			;;
+		esac
+
+
+		# Perform final sanity checks.
+		case ${with_arch} in
+		loongarch64 | la464) ;; # OK, append here.
+		native)
+			if test x${host} != x${target}; then
+				echo "--with-arch=native is illegal for a cross-compiler." 1>&2
+				exit 1
+			fi
+			;;
+		"")
+			echo "Please set a default value for \${with_arch}" \
+			     "according to your target triplet \"${target}\"." 1>&2
+			exit 1
+			;;
+		*)
+			echo "Unknown arch in --with-arch=$with_arch" 1>&2
+			exit 1
+			;;
+		esac
+
+		case ${with_abi} in
+		lp64d | lp64f | lp64s) ;; # OK, append here.
+		*)
+			echo "Unsupported ABI given in --with-abi=$with_abi" 1>&2
+			exit 1
+			;;
+		esac
+
+		case ${with_abiext} in
+		base) ;; # OK, append here.
+		*)
+			echo "Unsupported ABI extension type $with_abiext" 1>&2
+			exit 1
+			;;
+		esac
+
+		case ${with_fpu} in
+		none | 32 | 64) ;; # OK, append here.
+		*)
+			echo "Unknown fpu type in --with-fpu=$with_fpu" 1>&2
+			exit 1
+			;;
+		esac
+
+		# Handle --with-multilib-list.
+		if test x${with_multilib_list} == x \
+		   || test x${with_multilib_list} == xno \
+		   || test x${with_multilib_list} == xdefault \
+		   || test x${enable_multilib} != xyes; then
+
+			with_multilib_list="${with_abi}/${with_abiext}"
+		fi
+
+		# Check if the configured default ABI combination is included in
+		# ${with_multilib_list}.
+		loongarch_multilib_list_sane=no
+
+		# This one goes to TM_MULTILIB_CONFIG, for use in t-linux.
+		loongarch_multilib_list_make=""
+
+		# This one goes to tm_defines, for use in loongarch-driver.cc.
+		loongarch_multilib_list_c=""
+
+		for e in $(tr ',' ' ' <<< "${with_multilib_list}"); do
+			e=($(tr '/' ' ' <<< "$e"))
+
+			# Base ABI type
+			case ${e[0]} in
+			lp64d) loongarch_multilib_list_c+="ABI_BASE_LP64D,";;
+			lp64f) loongarch_multilib_list_c+="ABI_BASE_LP64F,";;
+			lp64s) loongarch_multilib_list_c+="ABI_BASE_LP64S,";;
+			*)
+				echo "Unknown base ABI \"${e[0]}\" in --with-multilib-list." 1>&2
+				exit 1
+				;;
+			esac
+			loongarch_multilib_list_make+="mabi=${e[0]}"
+
+			# ABI extension type
+			case ${e[1]} in
+			"" | base)
+				loongarch_multilib_list_make+=""
+				loongarch_multilib_list_c+="ABI_EXT_BASE,"
+				e[1]="base"
+				;;
+			*)
+				echo "Unknown ABI extension \"${e[1]}\" in --with-multilib-list." 1>&2
+				exit 1
+				;;
+			esac
+
+			case ${e[2]} in
+			"") ;; # OK
+			*)
+				echo "Unknown ABI in --with-multilib-list." 1>&2
+				exit 1
+				;;
+			esac
+
+			if test x${with_abi} != x && test x${with_abiext} != x; then
+				if test x${e[0]} == x${with_abi} \
+				&& test x${e[1]} == x${with_abiext}; then
+					loongarch_multilib_list_sane=yes
+				fi
+			fi
+
+			loongarch_multilib_list_make+=","
+		done
+
+		# Check if the default ABI combination is in the default list.
+		if test x${loongarch_multilib_list_sane} == xno; then
+			if test x${with_abiext} == xbase; then
+				with_abiext=""
+			else
+				with_abiext="/${with_abiext}"
+			fi
+
+			echo "Default ABI combination (${with_abi}${with_abiext})" \
+			"not found in --with-multilib-list." 1>&2
+			exit 1
+		fi
+
+		# Remove the trailing comma appended above.
+		loongarch_multilib_list_c=${loongarch_multilib_list_c:0:-1}
+		loongarch_multilib_list_make=${loongarch_multilib_list_make:0:-1}
+		;;
+
 	nds32*-*-*)
 		supported_defaults="arch cpu nds32_lib float fpu_config"
 
@@ -5370,6 +5733,51 @@ case ${target} in
 		tmake_file="mips/t-mips $tmake_file"
 		;;
 
+	loongarch*-*-*)
+		# Export canonical triplet.
+		tm_defines+=" LA_MULTIARCH_TRIPLET=${la_canonical_triplet}"
+
+		# Define macro __DISABLE_MULTILIB if --disable-multilib
+		tm_defines+=" TM_MULTILIB_LIST=${loongarch_multilib_list_c}"
+		if test x$enable_multilib == xyes; then
+			TM_MULTILIB_CONFIG="${loongarch_multilib_list_make}"
+		else
+			tm_defines+=" __DISABLE_MULTILIB"
+		fi
+
+		# Let --with- flags initialize the enum variables from loongarch.opt.
+		# See macro definitions from loongarch-opts.h and loongarch-cpu.h.
+		case ${with_arch} in
+		native)		tm_defines+=" DEFAULT_CPU_ARCH=CPU_NATIVE" ;;
+		la464)		tm_defines+=" DEFAULT_CPU_ARCH=CPU_LA464" ;;
+		loongarch64)	tm_defines+=" DEFAULT_CPU_ARCH=CPU_LOONGARCH64" ;;
+		esac
+
+		case ${with_tune} in
+		native)		tm_defines+=" DEFAULT_CPU_TUNE=CPU_NATIVE" ;;
+		la464)		tm_defines+=" DEFAULT_CPU_TUNE=CPU_LA464" ;;
+		loongarch64)	tm_defines+=" DEFAULT_CPU_TUNE=CPU_LOONGARCH64" ;;
+		esac
+
+		case ${with_abi} in
+		lp64d)     tm_defines+=" DEFAULT_ABI_BASE=ABI_BASE_LP64D" ;;
+		lp64f)     tm_defines+=" DEFAULT_ABI_BASE=ABI_BASE_LP64F" ;;
+		lp64s)     tm_defines+=" DEFAULT_ABI_BASE=ABI_BASE_LP64S" ;;
+		esac
+
+		case ${with_abiext} in
+		base)      tm_defines+=" DEFAULT_ABI_EXT=ABI_EXT_BASE" ;;
+		esac
+
+		case ${with_fpu} in
+		none)    tm_defines="$tm_defines DEFAULT_ISA_EXT_FPU=ISA_EXT_NOFPU" ;;
+		32)      tm_defines="$tm_defines DEFAULT_ISA_EXT_FPU=ISA_EXT_FPU32" ;;
+		64)      tm_defines="$tm_defines DEFAULT_ISA_EXT_FPU=ISA_EXT_FPU64" ;;
+		esac
+
+		tmake_file="loongarch/t-loongarch $tmake_file"
+		;;
+
 	powerpc*-*-* | rs6000-*-*)
 		# FIXME: The PowerPC port uses the value set at compile time,
 		# although it's only cosmetic.
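
To illustrate the --with-multilib-list handling added above, a usage
sketch (the target triplet is illustrative):

    # Default lp64d ABI plus an additional lp64s multilib variant.
    ../gcc/configure --target=loongarch64-linux-gnu --enable-multilib \
        --with-multilib-list=lp64d,lp64s --with-abi=lp64d

    # A list omitting the configured default ABI is rejected with:
    #   Default ABI combination (lp64d) not found in --with-multilib-list.
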
diff --git a/gcc/config/loongarch/genopts/genstr.sh b/gcc/config/loongarch/genopts/genstr.sh
new file mode 100755
index 00000000000..6c1211c9e92
--- /dev/null
+++ b/gcc/config/loongarch/genopts/genstr.sh
@@ -0,0 +1,91 @@
+#!/bin/sh
+# A simple script that generates loongarch-str.h and loongarch.opt
+# from genopts/loongarch-strings and genopts/loongarch.opt.in.
+#
+# Copyright (C) 2021-2022 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+# License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+cd "$(dirname "$0")"
+
+# Generate a header containing definitions from the string table.
+gen_defines() {
+    cat <<EOF
+/* Generated automatically by "genstr" from "loongarch-strings".
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef LOONGARCH_STR_H
+#define LOONGARCH_STR_H
+EOF
+
+    sed -e '/^$/n' -e 's@#.*$@@' -e '/^$/d' \
+	-e 's@^\([^ \t]\+\)[ \t]*\([^ \t]*\)@#define \1 "\2"@' \
+	loongarch-strings
+
+    echo
+    echo "#endif /* LOONGARCH_STR_H */"
+}
+
+
+# Substitute each "@@<KEY>@@" in loongarch.opt.in with its "<VALUE>"
+# according to the key-value pairs defined in loongarch-strings.
+
+gen_options() {
+
+    sed -e '/^$/n' -e 's@#.*$@@' -e '/^$/d' \
+	-e 's@^\([^ \t]\+\)[ \t]*\([^ \t]*\)@\1="\2"@' \
+	loongarch-strings | { \
+
+	# read the definitions
+	while read -r line; do
+	    eval "$line"
+	done
+
+	# make the substitutions
+	sed -e 's@"@\\"@g' -e 's/@@\([^@]\+\)@@/${\1}/g' loongarch.opt.in | \
+	    while read -r line; do
+		eval "echo \"$line\""
+	    done
+    }
+}
+
+main() {
+    case "$1" in
+	header) gen_defines;;
+	opt) gen_options;;
+	*) echo "Unknown command: \"$1\". Available: header, opt"; exit 1;;
+    esac
+}
+
+main "$@"
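
The script writes to stdout and changes to its own directory first, so
regeneration can be sketched as follows (output locations assumed from
t-loongarch, added later in this patch):

    sh gcc/config/loongarch/genopts/genstr.sh header \
        > gcc/config/loongarch/loongarch-str.h
    sh gcc/config/loongarch/genopts/genstr.sh opt \
        > gcc/config/loongarch/loongarch.opt
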
diff --git a/gcc/config/loongarch/genopts/loongarch-strings b/gcc/config/loongarch/genopts/loongarch-strings
new file mode 100644
index 00000000000..cb88ed56b68
--- /dev/null
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -0,0 +1,58 @@
+# Defines the key strings for LoongArch compiler options.
+#
+# Copyright (C) 2021-2022 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+# License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# -march= / -mtune=
+OPTSTR_ARCH	      arch
+OPTSTR_TUNE	      tune
+
+STR_CPU_NATIVE	      native
+STR_CPU_LOONGARCH64   loongarch64
+STR_CPU_LA464	      la464
+
+# Base architecture
+STR_ISA_BASE_LA64V100 la64
+
+# -mfpu
+OPTSTR_ISA_EXT_FPU    fpu
+STR_ISA_EXT_NOFPU     none
+STR_ISA_EXT_FPU0      0
+STR_ISA_EXT_FPU32     32
+STR_ISA_EXT_FPU64     64
+
+OPTSTR_SOFT_FLOAT     soft-float
+OPTSTR_SINGLE_FLOAT   single-float
+OPTSTR_DOUBLE_FLOAT   double-float
+
+# -mabi=
+OPTSTR_ABI_BASE	      abi
+STR_ABI_BASE_LP64D    lp64d
+STR_ABI_BASE_LP64F    lp64f
+STR_ABI_BASE_LP64S    lp64s
+
+# ABI extension types
+STR_ABI_EXT_BASE      base
+
+# -mcmodel=
+OPTSTR_CMODEL	      cmodel
+STR_CMODEL_NORMAL     normal
+STR_CMODEL_TINY	      tiny
+STR_CMODEL_TS	      tiny-static
+STR_CMODEL_LARGE      large
+STR_CMODEL_EXTREME    extreme
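
Each key-value pair above becomes both a string macro in the generated
loongarch-str.h and a substitution for the matching @@KEY@@ token in
loongarch.opt.in. A sketch of one entry's journey:

    # loongarch-strings entry:          OPTSTR_ABI_BASE  abi
    # generated loongarch-str.h line:   #define OPTSTR_ABI_BASE "abi"
    # loongarch.opt.in template line:   m@@OPTSTR_ABI_BASE@@=
    # generated loongarch.opt line:     mabi=
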
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
new file mode 100644
index 00000000000..01c8b3e5308
--- /dev/null
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -0,0 +1,189 @@
+; Generated by "genstr" from the template "loongarch.opt.in"
+; and definitions from "loongarch-strings".
+;
+; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;
+; This file is part of GCC.
+;
+; GCC is free software; you can redistribute it and/or modify it under
+; the terms of the GNU General Public License as published by the Free
+; Software Foundation; either version 3, or (at your option) any later
+; version.
+;
+; GCC is distributed in the hope that it will be useful, but WITHOUT
+; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+; License for more details.
+;
+; You should have received a copy of the GNU General Public License
+; along with GCC; see the file COPYING3.  If not see
+; <http://www.gnu.org/licenses/>.
+;
+
+; Variables (macros) that should be exported by loongarch.opt:
+;   la_opt_switches,
+;   la_opt_abi_base, la_opt_abi_ext,
+;   la_opt_cpu_arch, la_opt_cpu_tune,
+;   la_opt_fpu,
+;   la_cmodel.
+
+HeaderInclude
+config/loongarch/loongarch-opts.h
+
+HeaderInclude
+config/loongarch/loongarch-str.h
+
+Variable
+HOST_WIDE_INT la_opt_switches = 0
+
+; ISA related options
+;; Base ISA
+Enum
+Name(isa_base) Type(int)
+Basic ISAs of LoongArch:
+
+EnumValue
+Enum(isa_base) String(@@STR_ISA_BASE_LA64V100@@) Value(ISA_BASE_LA64V100)
+
+
+;; ISA extensions / adjustments
+Enum
+Name(isa_ext_fpu) Type(int)
+FPU types of LoongArch:
+
+EnumValue
+Enum(isa_ext_fpu) String(@@STR_ISA_EXT_NOFPU@@) Value(ISA_EXT_NOFPU)
+
+EnumValue
+Enum(isa_ext_fpu) String(@@STR_ISA_EXT_FPU32@@) Value(ISA_EXT_FPU32)
+
+EnumValue
+Enum(isa_ext_fpu) String(@@STR_ISA_EXT_FPU64@@) Value(ISA_EXT_FPU64)
+
+m@@OPTSTR_ISA_EXT_FPU@@=
+Target RejectNegative Joined ToLower Enum(isa_ext_fpu) Var(la_opt_fpu) Init(M_OPTION_NOT_SEEN)
+-m@@OPTSTR_ISA_EXT_FPU@@=FPU	Generate code for the given FPU.
+
+m@@OPTSTR_ISA_EXT_FPU@@=@@STR_ISA_EXT_FPU0@@
+Target RejectNegative Alias(m@@OPTSTR_ISA_EXT_FPU@@=,@@STR_ISA_EXT_NOFPU@@)
+
+m@@OPTSTR_SOFT_FLOAT@@
+Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_SOFTF) Negative(m@@OPTSTR_SINGLE_FLOAT@@)
+Prevent the use of all hardware floating-point instructions.
+
+m@@OPTSTR_SINGLE_FLOAT@@
+Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F32) Negative(m@@OPTSTR_DOUBLE_FLOAT@@)
+Restrict the use of hardware floating-point instructions to 32-bit operations.
+
+m@@OPTSTR_DOUBLE_FLOAT@@
+Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) Negative(m@@OPTSTR_SOFT_FLOAT@@)
+Allow hardware floating-point instructions to cover both 32-bit and 64-bit operations.
+
+
+;; Base target models (implies ISA & tune parameters)
+Enum
+Name(cpu_type) Type(int)
+LoongArch CPU types:
+
+EnumValue
+Enum(cpu_type) String(@@STR_CPU_NATIVE@@) Value(CPU_NATIVE)
+
+EnumValue
+Enum(cpu_type) String(@@STR_CPU_LOONGARCH64@@) Value(CPU_LOONGARCH64)
+
+EnumValue
+Enum(cpu_type) String(@@STR_CPU_LA464@@) Value(CPU_LA464)
+
+m@@OPTSTR_ARCH@@=
+Target RejectNegative Joined Enum(cpu_type) Var(la_opt_cpu_arch) Init(M_OPTION_NOT_SEEN)
+-m@@OPTSTR_ARCH@@=PROCESSOR	Generate code for the given PROCESSOR ISA.
+
+m@@OPTSTR_TUNE@@=
+Target RejectNegative Joined Enum(cpu_type) Var(la_opt_cpu_tune) Init(M_OPTION_NOT_SEEN)
+-m@@OPTSTR_TUNE@@=PROCESSOR	Generate optimized code for PROCESSOR.
+
+
+; ABI related options
+; (ISA constraints on ABI are handled dynamically)
+
+;; Base ABI
+Enum
+Name(abi_base) Type(int)
+Base ABI types for LoongArch:
+
+EnumValue
+Enum(abi_base) String(@@STR_ABI_BASE_LP64D@@) Value(ABI_BASE_LP64D)
+
+EnumValue
+Enum(abi_base) String(@@STR_ABI_BASE_LP64F@@) Value(ABI_BASE_LP64F)
+
+EnumValue
+Enum(abi_base) String(@@STR_ABI_BASE_LP64S@@) Value(ABI_BASE_LP64S)
+
+m@@OPTSTR_ABI_BASE@@=
+Target RejectNegative Joined ToLower Enum(abi_base) Var(la_opt_abi_base) Init(M_OPTION_NOT_SEEN)
+-m@@OPTSTR_ABI_BASE@@=BASEABI	Generate code that conforms to the given BASEABI.
+
+;; ABI Extension
+Variable
+int la_opt_abi_ext = M_OPTION_NOT_SEEN
+
+
+mbranch-cost=
+Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
+-mbranch-cost=COST	Set the cost of branches to roughly COST instructions.
+
+mcheck-zero-division
+Target Mask(CHECK_ZERO_DIV)
+Trap on integer divide by zero.
+
+mcond-move-int
+Target Var(TARGET_COND_MOVE_INT) Init(1)
+Conditional moves for integers are enabled.
+
+mcond-move-float
+Target Var(TARGET_COND_MOVE_FLOAT) Init(1)
+Conditional moves for floating point are enabled.
+
+mmemcpy
+Target Mask(MEMCPY)
+Don't optimize block moves.
+
+mlra
+Target Var(loongarch_lra_flag) Init(1) Save
+Use LRA instead of reload.
+
+noasmopt
+Driver
+
+mstrict-align
+Target Var(TARGET_STRICT_ALIGN) Init(0)
+Do not generate unaligned memory accesses.
+
+mmax-inline-memcpy-size=
+Target Joined RejectNegative UInteger Var(loongarch_max_inline_memcpy_size) Init(1024)
+-mmax-inline-memcpy-size=SIZE	Set the maximum size of memcpy to inline (default: 1024).
+
+; The code model option names for -mcmodel.
+Enum
+Name(cmodel) Type(int)
+The code model option names for -mcmodel:
+
+EnumValue
+Enum(cmodel) String(@@STR_CMODEL_NORMAL@@) Value(CMODEL_NORMAL)
+
+EnumValue
+Enum(cmodel) String(@@STR_CMODEL_TINY@@) Value(CMODEL_TINY)
+
+EnumValue
+Enum(cmodel) String(@@STR_CMODEL_TS@@) Value(CMODEL_TINY_STATIC)
+
+EnumValue
+Enum(cmodel) String(@@STR_CMODEL_LARGE@@) Value(CMODEL_LARGE)
+
+EnumValue
+Enum(cmodel) String(@@STR_CMODEL_EXTREME@@) Value(CMODEL_EXTREME)
+
+mcmodel=
+Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel) Init(CMODEL_NORMAL)
+Specify the code model.
diff --git a/gcc/config/loongarch/gnu-user.h b/gcc/config/loongarch/gnu-user.h
new file mode 100644
index 00000000000..9c186277c62
--- /dev/null
+++ b/gcc/config/loongarch/gnu-user.h
@@ -0,0 +1,84 @@
+/* Definitions for LoongArch systems using GNU (glibc-based) userspace,
+   or other userspace with libc derived from glibc.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Define the size of the wide character type.  */
+#undef WCHAR_TYPE
+#define WCHAR_TYPE "int"
+
+#undef WCHAR_TYPE_SIZE
+#define WCHAR_TYPE_SIZE 32
+
+
+/* GNU-specific SPEC definitions.  */
+#define GNU_USER_LINK_EMULATION "elf" ABI_GRLEN_SPEC "loongarch"
+
+#undef GLIBC_DYNAMIC_LINKER
+#define GLIBC_DYNAMIC_LINKER \
+  "/lib" ABI_GRLEN_SPEC "/ld-linux-loongarch-" ABI_SPEC ".so.1"
+
+#undef MUSL_DYNAMIC_LINKER
+#define MUSL_DYNAMIC_LINKER \
+  "/lib" ABI_GRLEN_SPEC "/ld-musl-loongarch-" ABI_SPEC ".so.1"
+
+#undef GNU_USER_TARGET_LINK_SPEC
+#define GNU_USER_TARGET_LINK_SPEC \
+  "%{G*} %{shared} -m " GNU_USER_LINK_EMULATION \
+  "%{!shared: %{static} %{!static: %{rdynamic:-export-dynamic} " \
+  "-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}"
+
+
+/* Similar to standard Linux, but adding -ffast-math support.  */
+#undef GNU_USER_TARGET_MATHFILE_SPEC
+#define GNU_USER_TARGET_MATHFILE_SPEC \
+  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}"
+
+#undef LIB_SPEC
+#define LIB_SPEC GNU_USER_TARGET_LIB_SPEC
+
+#undef LINK_SPEC
+#define LINK_SPEC GNU_USER_TARGET_LINK_SPEC
+
+#undef ENDFILE_SPEC
+#define ENDFILE_SPEC \
+  GNU_USER_TARGET_MATHFILE_SPEC " " \
+  GNU_USER_TARGET_ENDFILE_SPEC
+
+#undef SUBTARGET_CPP_SPEC
+#define SUBTARGET_CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"
+
+/* A standard GNU/Linux mapping.  On most targets, it is included in
+   CC1_SPEC itself by config/linux.h, but loongarch.h overrides CC1_SPEC
+   and provides this hook instead.  */
+#undef SUBTARGET_CC1_SPEC
+#define SUBTARGET_CC1_SPEC GNU_USER_TARGET_CC1_SPEC
+
+#define TARGET_OS_CPP_BUILTINS() \
+  do \
+    { \
+      GNU_USER_TARGET_OS_CPP_BUILTINS (); \
+      /* The GNU C++ standard library requires this.  */ \
+      if (c_dialect_cxx ()) \
+       builtin_define ("_GNU_SOURCE"); \
+    } \
+  while (0)
+
+
+#undef ASM_DECLARE_OBJECT_NAME
+#define ASM_DECLARE_OBJECT_NAME loongarch_declare_object_name
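
With the specs above, and assuming ABI_GRLEN_SPEC expands to the
general-purpose register width ("64") and ABI_SPEC to the base ABI name
(both provided elsewhere in this series), the dynamic linker paths for
an lp64d configuration would resolve to:

    /lib64/ld-linux-loongarch-lp64d.so.1    # glibc
    /lib64/ld-musl-loongarch-lp64d.so.1     # musl
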
diff --git a/gcc/config/loongarch/linux.h b/gcc/config/loongarch/linux.h
new file mode 100644
index 00000000000..cd54a265336
--- /dev/null
+++ b/gcc/config/loongarch/linux.h
@@ -0,0 +1,50 @@
+/* Definitions for Linux-based systems with libraries in ELF format.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Default system library search paths.
+   This ensures that a compiler configured with --disable-multilib
+   can work in a multilib environment.  */
+
+#if defined(__DISABLE_MULTILIB) && defined(__DISABLE_MULTIARCH)
+
+  #if DEFAULT_ABI_BASE == ABI_BASE_LP64D
+    #define ABI_LIBDIR "lib64"
+  #elif DEFAULT_ABI_BASE == ABI_BASE_LP64F
+    #define ABI_LIBDIR "lib64/f32"
+  #elif DEFAULT_ABI_BASE == ABI_BASE_LP64S
+    #define ABI_LIBDIR "lib64/sf"
+  #endif
+
+#endif
+
+#ifndef ABI_LIBDIR
+#define ABI_LIBDIR "lib"
+#endif
+
+#define STANDARD_STARTFILE_PREFIX_1 "/" ABI_LIBDIR "/"
+#define STANDARD_STARTFILE_PREFIX_2 "/usr/" ABI_LIBDIR "/"
+
+
+/* Define this to be nonzero if static stack checking is supported.  */
+#define STACK_CHECK_STATIC_BUILTIN 1
+
+/* The default value isn't sufficient in 64-bit mode.  */
+#define STACK_CHECK_PROTECT (TARGET_64BIT ? 16 * 1024 : 12 * 1024)
+
+#define TARGET_ASM_FILE_END file_end_indicate_exec_stack
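
Under --disable-multilib and --disable-multiarch, the startfile search
prefixes therefore follow the configured default ABI; a summary of the
mapping defined above:

    # DEFAULT_ABI_BASE    ABI_LIBDIR    startfile prefixes
    # ABI_BASE_LP64D      lib64         /lib64/      /usr/lib64/
    # ABI_BASE_LP64F      lib64/f32     /lib64/f32/  /usr/lib64/f32/
    # ABI_BASE_LP64S      lib64/sf      /lib64/sf/   /usr/lib64/sf/
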
diff --git a/gcc/config/loongarch/loongarch-cpu.cc b/gcc/config/loongarch/loongarch-cpu.cc
new file mode 100644
index 00000000000..a886dd9323b
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-cpu.cc
@@ -0,0 +1,206 @@
+/* Definitions for LoongArch CPU properties.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "diagnostic-core.h"
+
+#include "loongarch-opts.h"
+#include "loongarch-cpu.h"
+#include "loongarch-str.h"
+
+/* Native CPU detection with "cpucfg".  */
+#define N_CPUCFG_WORDS 0x15
+static uint32_t cpucfg_cache[N_CPUCFG_WORDS] = { 0 };
+static const int cpucfg_useful_idx[] = {0, 1, 2, 16, 17, 18, 19};
+
+static uint32_t
+read_cpucfg_word (int wordno)
+{
+  /* Keep cross compilers quiet about the unused parameter.  */
+  (void) wordno;
+  uint32_t ret = 0;
+
+  #ifdef __loongarch__
+  __asm__ __volatile__ ("cpucfg %0,%1\n\t"
+			:"=r"(ret)
+			:"r"(wordno)
+			:);
+  #endif
+
+  return ret;
+}
+
+void
+cache_cpucfg (void)
+{
+  for (unsigned int i = 0; i < sizeof (cpucfg_useful_idx) / sizeof (int); i++)
+    {
+      cpucfg_cache[cpucfg_useful_idx[i]]
+	= read_cpucfg_word (cpucfg_useful_idx[i]);
+    }
+}
+
+uint32_t
+get_native_prid (void)
+{
+  /* Return the native PRID from the cached cpucfg data (word 0);
+     see "Loongson Architecture Reference Manual"
+     (Volume 1, Section 2.2.10.5).  */
+  return cpucfg_cache[0];
+}
+
+const char*
+get_native_prid_str (void)
+{
+  static char prid_str[9];
+  sprintf (prid_str, "%08x", cpucfg_cache[0]);
+  return (const char*) prid_str;
+}
+
+/* Fill property tables for CPU_NATIVE.  */
+unsigned int
+fill_native_cpu_config (int p_arch_native, int p_tune_native)
+{
+  int ret_cpu_type;
+
+  /* Nothing needs to be done unless "-march/tune=native"
+     is given or implied.  */
+  if (!(p_arch_native || p_tune_native))
+    return CPU_NATIVE;
+
+  /* Fill cpucfg_cache with the "cpucfg" instruction.  */
+  cache_cpucfg ();
+
+
+  /* Fill: loongarch_cpu_default_isa[CPU_NATIVE].base
+     With: base architecture (ARCH)
+     At:   cpucfg_words[1][1:0] */
+
+  #define NATIVE_BASE_ISA (loongarch_cpu_default_isa[CPU_NATIVE].base)
+  switch (cpucfg_cache[1] & 0x3)
+    {
+      case 0x02:
+	NATIVE_BASE_ISA = ISA_BASE_LA64V100;
+	break;
+
+      default:
+	if (p_arch_native)
+	  fatal_error (UNKNOWN_LOCATION,
+		       "unknown base architecture %<0x%x%>, %qs failed",
+		       (unsigned int) (cpucfg_cache[1] & 0x3),
+		       "-m" OPTSTR_ARCH "=" STR_CPU_NATIVE);
+    }
+
+  /* Fill: loongarch_cpu_default_isa[CPU_NATIVE].fpu
+     With: FPU type (FP, FP_SP, FP_DP)
+     At:   cpucfg_words[2][2:0] */
+
+  #define NATIVE_FPU (loongarch_cpu_default_isa[CPU_NATIVE].fpu)
+  switch (cpucfg_cache[2] & 0x7)
+    {
+      case 0x07:
+	NATIVE_FPU = ISA_EXT_FPU64;
+	break;
+
+      case 0x03:
+	NATIVE_FPU = ISA_EXT_FPU32;
+	break;
+
+      case 0x00:
+	NATIVE_FPU = ISA_EXT_NOFPU;
+	break;
+
+      default:
+	if (p_arch_native)
+	  fatal_error (UNKNOWN_LOCATION,
+		       "unknown FPU type %<0x%x%>, %qs failed",
+		       (unsigned int) (cpucfg_cache[2] & 0x7),
+		       "-m" OPTSTR_ARCH "=" STR_CPU_NATIVE);
+    }
+
+  /* Fill: loongarch_cpu_cache[CPU_NATIVE]
+     With: cache size info
+     At:   cpucfg_words[16:20][31:0] */
+
+  int l1d_present = 0, l1u_present = 0;
+  int l2d_present = 0;
+  uint32_t l1_szword, l2_szword;
+
+  l1u_present |= cpucfg_cache[16] & 3;	      /* bit[1:0]: unified l1 cache */
+  l1d_present |= cpucfg_cache[16] & 4;	      /* bit[2:2]: l1 dcache */
+  l1_szword = l1d_present ? 18 : (l1u_present ? 17 : 0);
+  l1_szword = l1_szword ? cpucfg_cache[l1_szword]: 0;
+
+  l2d_present |= cpucfg_cache[16] & 24;	      /* bit[4:3]: unified l2 cache */
+  l2d_present |= cpucfg_cache[16] & 128;      /* bit[7:7]: l2 dcache */
+  l2_szword = l2d_present ? cpucfg_cache[19]: 0;
+
+  loongarch_cpu_cache[CPU_NATIVE].l1d_line_size
+    = 1 << ((l1_szword & 0x7f000000) >> 24);  /* bit[30:24]: log2(linesize) */
+
+  loongarch_cpu_cache[CPU_NATIVE].l1d_size
+    = (1 << ((l1_szword & 0x00ff0000) >> 16)) /* bit[23:16]: log2(idx) */
+    * ((l1_szword & 0x0000ffff) + 1)	      /* bit[15:0]:  sets - 1 */
+    * (1 << ((l1_szword & 0x7f000000) >> 24)) /* bit[30:24]: log2(linesize) */
+    >> 10;				      /* in kilobytes */
+
+  loongarch_cpu_cache[CPU_NATIVE].l2d_size
+    = (1 << ((l2_szword & 0x00ff0000) >> 16)) /* bit[23:16]: log2(idx) */
+    * ((l2_szword & 0x0000ffff) + 1)	      /* bit[15:0]:  sets - 1 */
+    * (1 << ((l2_szword & 0x7f000000) >> 24)) /* bit[30:24]: log2(linesize) */
+    >> 10;				      /* in kilobytes */
+
+  /* Fill: ret_cpu_type
+     With: processor ID (PRID)
+     At:   cpucfg_words[0][31:0] */
+
+  switch (cpucfg_cache[0] & 0x00ffff00)
+  {
+    case 0x0014c000:   /* LA464 */
+      ret_cpu_type = CPU_LA464;
+      break;
+
+    default:
+      /* Unknown PRID.  This is generally harmless as long as
+	 the properties above can be obtained via "cpucfg".  */
+      if (p_tune_native)
+	inform (UNKNOWN_LOCATION, "unknown processor ID %<0x%x%>, "
+		"some tuning parameters will fall back to default",
+		cpucfg_cache[0]);
+      break;
+  }
+
+  /* Properties that cannot be looked up directly using cpucfg.  */
+  loongarch_cpu_issue_rate[CPU_NATIVE]
+    = loongarch_cpu_issue_rate[ret_cpu_type];
+
+  loongarch_cpu_multipass_dfa_lookahead[CPU_NATIVE]
+    = loongarch_cpu_multipass_dfa_lookahead[ret_cpu_type];
+
+  loongarch_cpu_rtx_cost_data[CPU_NATIVE]
+    = loongarch_cpu_rtx_cost_data[ret_cpu_type];
+
+  return ret_cpu_type;
+}
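
The detection above only takes effect for -march=native or
-mtune=native on a LoongArch host; elsewhere read_cpucfg_word simply
returns 0. To inspect what native detection resolved to, GCC's generic
option-dump facility can be used (a usage sketch, not part of the
patch):

    gcc -march=native -Q --help=target | grep -E 'march|mtune|mfpu'
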
diff --git a/gcc/config/loongarch/loongarch-cpu.h b/gcc/config/loongarch/loongarch-cpu.h
new file mode 100644
index 00000000000..93d656f70f5
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-cpu.h
@@ -0,0 +1,30 @@
+/* Definitions for loongarch native cpu property detection routines.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef LOONGARCH_CPU_H
+#define LOONGARCH_CPU_H
+
+#include "system.h"
+
+void cache_cpucfg (void);
+unsigned int fill_native_cpu_config (int p_arch_native, int p_tune_native);
+uint32_t get_native_prid (void);
+const char* get_native_prid_str (void);
+
+#endif /* LOONGARCH_CPU_H */
diff --git a/gcc/config/loongarch/loongarch-def.c b/gcc/config/loongarch/loongarch-def.c
new file mode 100644
index 00000000000..e41616853a5
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -0,0 +1,164 @@
+/* LoongArch static properties.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "loongarch-def.h"
+#include "loongarch-str.h"
+
+/* CPU property tables.  */
+const char*
+loongarch_cpu_strings[N_TUNE_TYPES] = {
+  [CPU_NATIVE]		  = STR_CPU_NATIVE,
+  [CPU_LOONGARCH64]	  = STR_CPU_LOONGARCH64,
+  [CPU_LA464]		  = STR_CPU_LA464,
+};
+
+struct loongarch_isa
+loongarch_cpu_default_isa[N_ARCH_TYPES] = {
+  [CPU_LOONGARCH64] = {
+      .base = ISA_BASE_LA64V100,
+      .fpu = ISA_EXT_FPU64,
+  },
+  [CPU_LA464] = {
+      .base = ISA_BASE_LA64V100,
+      .fpu = ISA_EXT_FPU64,
+  },
+};
+
+struct loongarch_cache
+loongarch_cpu_cache[N_TUNE_TYPES] = {
+  [CPU_LOONGARCH64] = {
+      .l1d_line_size = 64,
+      .l1d_size = 64,
+      .l2d_size = 256,
+  },
+  [CPU_LA464] = {
+      .l1d_line_size = 64,
+      .l1d_size = 64,
+      .l2d_size = 256,
+  },
+};
+
+/* The following properties cannot be looked up directly using "cpucfg".
+   So it is necessary to provide a default value for "unknown native"
+   tune targets (i.e. -mtune=native while PRID does not correspond to
+   any known "-mtune" type).  */
+
+struct loongarch_rtx_cost_data
+loongarch_cpu_rtx_cost_data[N_TUNE_TYPES] = {
+  [CPU_NATIVE] = {
+      DEFAULT_COSTS
+  },
+  [CPU_LOONGARCH64] = {
+      DEFAULT_COSTS
+  },
+  [CPU_LA464] = {
+      DEFAULT_COSTS
+  },
+};
+
+/* RTX costs to use when optimizing for size.  */
+extern const struct loongarch_rtx_cost_data
+loongarch_rtx_cost_optimize_size = {
+    .fp_add	      = 4,
+    .fp_mult_sf	      = 4,
+    .fp_mult_df	      = 4,
+    .fp_div_sf	      = 4,
+    .fp_div_df	      = 4,
+    .int_mult_si      = 4,
+    .int_mult_di      = 4,
+    .int_div_si	      = 4,
+    .int_div_di	      = 4,
+    .branch_cost      = 2,
+    .memory_latency   = 4,
+};
+
+int
+loongarch_cpu_issue_rate[N_TUNE_TYPES] = {
+  [CPU_NATIVE]	      = 4,
+  [CPU_LOONGARCH64]   = 4,
+  [CPU_LA464]	      = 4,
+};
+
+int
+loongarch_cpu_multipass_dfa_lookahead[N_TUNE_TYPES] = {
+  [CPU_NATIVE]	      = 4,
+  [CPU_LOONGARCH64]   = 4,
+  [CPU_LA464]	      = 4,
+};
+
+/* Wire the string definitions from loongarch-str.h to global arrays
+   indexed by the standard codes from loongarch-def.h, so that the
+   driver can print configuration messages and do ABI self-spec
+   filtering in a self-consistent manner.  */
+
+const char*
+loongarch_isa_base_strings[N_ISA_BASE_TYPES] = {
+  [ISA_BASE_LA64V100] = STR_ISA_BASE_LA64V100,
+};
+
+const char*
+loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
+  [ISA_EXT_FPU64] = STR_ISA_EXT_FPU64,
+  [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
+  [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
+};
+
+const char*
+loongarch_abi_base_strings[N_ABI_BASE_TYPES] = {
+  [ABI_BASE_LP64D] = STR_ABI_BASE_LP64D,
+  [ABI_BASE_LP64F] = STR_ABI_BASE_LP64F,
+  [ABI_BASE_LP64S] = STR_ABI_BASE_LP64S,
+};
+
+const char*
+loongarch_abi_ext_strings[N_ABI_EXT_TYPES] = {
+  [ABI_EXT_BASE] = STR_ABI_EXT_BASE,
+};
+
+const char*
+loongarch_cmodel_strings[] = {
+  [CMODEL_NORMAL]	  = STR_CMODEL_NORMAL,
+  [CMODEL_TINY]		  = STR_CMODEL_TINY,
+  [CMODEL_TINY_STATIC]	  = STR_CMODEL_TS,
+  [CMODEL_LARGE]	  = STR_CMODEL_LARGE,
+  [CMODEL_EXTREME]	  = STR_CMODEL_EXTREME,
+};
+
+const char*
+loongarch_switch_strings[] = {
+  [SW_SOFT_FLOAT]	  = OPTSTR_SOFT_FLOAT,
+  [SW_SINGLE_FLOAT]	  = OPTSTR_SINGLE_FLOAT,
+  [SW_DOUBLE_FLOAT]	  = OPTSTR_DOUBLE_FLOAT,
+};
+
+
+/* ABI-related definitions.  */
+const struct loongarch_isa
+abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES] = {
+  [ABI_BASE_LP64D] = {
+      [ABI_EXT_BASE] = {.base = ISA_BASE_LA64V100, .fpu = ISA_EXT_FPU64},
+  },
+  [ABI_BASE_LP64F] = {
+      [ABI_EXT_BASE] = {.base = ISA_BASE_LA64V100, .fpu = ISA_EXT_FPU32},
+  },
+  [ABI_BASE_LP64S] = {
+      [ABI_EXT_BASE] = {.base = ISA_BASE_LA64V100, .fpu = ISA_EXT_NOFPU},
+  },
+};
diff --git a/gcc/config/loongarch/loongarch-def.h b/gcc/config/loongarch/loongarch-def.h
new file mode 100644
index 00000000000..c2c35b6ba5c
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -0,0 +1,151 @@
+/* LoongArch definitions.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Definition of standard codes for:
+    - base architecture types	(isa_base),
+    - ISA extensions		(isa_ext),
+    - base ABI types		(abi_base),
+    - ABI extension types	(abi_ext).
+
+    - code models		      (cmodel)
+    - other command-line switches     (switch)
+
+   These values are primarily used for implementing option handling
+   logic in "loongarch.opt", "loongarch-driver.cc" and
+   "loongarch-opts.cc".
+
+   As the result of this option handling process, the following
+   scheme is adopted to represent the final configuration:
+
+    - The target ABI is encoded with a tuple (abi_base, abi_ext)
+      using the codes defined below.
+
+    - The target ISA is encoded with a "struct loongarch_isa"
+      defined below.
+
+    - The target microarchitecture is represented with a CPU model
+      index (CPU_*) defined below.  */
+
+#ifndef LOONGARCH_DEF_H
+#define LOONGARCH_DEF_H
+
+#include "loongarch-tune.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* enum isa_base */
+extern const char* loongarch_isa_base_strings[];
+#define ISA_BASE_LA64V100     0
+#define N_ISA_BASE_TYPES      1
+
+/* enum isa_ext_* */
+extern const char* loongarch_isa_ext_strings[];
+#define ISA_EXT_NOFPU	      0
+#define ISA_EXT_FPU32	      1
+#define ISA_EXT_FPU64	      2
+#define N_ISA_EXT_FPU_TYPES   3
+#define N_ISA_EXT_TYPES	      3
+
+/* enum abi_base */
+extern const char* loongarch_abi_base_strings[];
+#define ABI_BASE_LP64D	      0
+#define ABI_BASE_LP64F	      1
+#define ABI_BASE_LP64S	      2
+#define N_ABI_BASE_TYPES      3
+
+/* enum abi_ext */
+extern const char* loongarch_abi_ext_strings[];
+#define ABI_EXT_BASE	      0
+#define N_ABI_EXT_TYPES	      1
+
+/* enum cmodel */
+extern const char* loongarch_cmodel_strings[];
+#define CMODEL_NORMAL	      0
+#define CMODEL_TINY	      1
+#define CMODEL_TINY_STATIC    2
+#define CMODEL_LARGE	      3
+#define CMODEL_EXTREME	      4
+#define N_CMODEL_TYPES	      5
+
+/* enum switches */
+/* The "SW_" codes represent command-line switches (options that
+   accept no parameters). Definition for other switches that affects
+   the target ISA / ABI configuration will also be appended here
+   in the future.  */
+
+extern const char* loongarch_switch_strings[];
+#define SW_SOFT_FLOAT	      0
+#define SW_SINGLE_FLOAT	      1
+#define SW_DOUBLE_FLOAT	      2
+#define N_SWITCH_TYPES	      3
+
+/* The common default value for variables whose assignments
+   are triggered by command-line options.  */
+
+#define M_OPTION_NOT_SEEN -1
+#define M_OPT_ABSENT(opt_enum)  ((opt_enum) == M_OPTION_NOT_SEEN)
+
+
+/* Internal representation of the target.  */
+struct loongarch_isa
+{
+  unsigned char base;	    /* ISA_BASE_ */
+  unsigned char fpu;	    /* ISA_EXT_FPU_ */
+};
+
+struct loongarch_abi
+{
+  unsigned char base;	    /* ABI_BASE_ */
+  unsigned char ext;	    /* ABI_EXT_ */
+};
+
+struct loongarch_target
+{
+  struct loongarch_isa isa;
+  struct loongarch_abi abi;
+  unsigned char cpu_arch;   /* CPU_ */
+  unsigned char cpu_tune;   /* same */
+  unsigned char cpu_native; /* same */
+  unsigned char cmodel;	    /* CMODEL_ */
+};
+
+/* CPU properties.  */
+/* index */
+#define CPU_NATIVE	  0
+#define CPU_LOONGARCH64	  1
+#define CPU_LA464	  2
+#define N_ARCH_TYPES	  3
+#define N_TUNE_TYPES	  3
+
+/* parallel tables.  */
+extern const char* loongarch_cpu_strings[];
+extern struct loongarch_isa loongarch_cpu_default_isa[];
+extern int loongarch_cpu_issue_rate[];
+extern int loongarch_cpu_multipass_dfa_lookahead[];
+
+extern struct loongarch_cache loongarch_cpu_cache[];
+extern struct loongarch_rtx_cost_data loongarch_cpu_rtx_cost_data[];
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* LOONGARCH_DEF_H */
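
To illustrate how these parallel tables are meant to be used (a sketch, not
part of the patch; it only assumes loongarch-def.h is on the include path):

    #include <stdio.h>
    #include "loongarch-def.h"

    /* Print the default ISA implied by -march=la464 by plain indexing
       of the parallel tables: prints "la464: base la64, fpu 64".  */
    static void
    print_la464_defaults (void)
    {
      struct loongarch_isa isa = loongarch_cpu_default_isa[CPU_LA464];
      printf ("%s: base %s, fpu %s\n",
	      loongarch_cpu_strings[CPU_LA464],
	      loongarch_isa_base_strings[isa.base],
	      loongarch_isa_ext_strings[isa.fpu]);
    }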
diff --git a/gcc/config/loongarch/loongarch-driver.cc b/gcc/config/loongarch/loongarch-driver.cc
new file mode 100644
index 00000000000..0adcc923b7d
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-driver.cc
@@ -0,0 +1,187 @@
+/* Subroutines for the gcc driver.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "obstack.h"
+#include "diagnostic-core.h"
+
+#include "loongarch-opts.h"
+#include "loongarch-driver.h"
+
+static int
+  opt_arch_driver = M_OPTION_NOT_SEEN,
+  opt_tune_driver = M_OPTION_NOT_SEEN,
+  opt_fpu_driver = M_OPTION_NOT_SEEN,
+  opt_abi_base_driver = M_OPTION_NOT_SEEN,
+  opt_abi_ext_driver = M_OPTION_NOT_SEEN,
+  opt_cmodel_driver = M_OPTION_NOT_SEEN;
+
+int opt_switches = 0;
+
+/* This flag is set to 1 if we believe that the user might be avoiding
+   linking (implicitly) against something from the startfile search paths.  */
+static int no_link = 0;
+
+#define LARCH_DRIVER_SET_M_FLAG(OPTS_ARRAY, N_OPTS, FLAG, STR)	\
+  for (int i = 0; i < (N_OPTS); i++)				\
+  {								\
+    if ((OPTS_ARRAY)[i] != 0)					\
+      if (strcmp ((STR), (OPTS_ARRAY)[i]) == 0)			\
+	(FLAG) = i;						\
+  }
+
+/* Use the public obstack from the gcc driver (defined in gcc.cc).
+   This is for allocating space for the returned string.  */
+extern struct obstack opts_obstack;
+
+#define APPEND_LTR(S)				      \
+  obstack_grow (&opts_obstack, (const void*) (S),     \
+		sizeof ((S)) / sizeof (char) -1)
+
+#define APPEND_VAL(S) \
+  obstack_grow (&opts_obstack, (const void*) (S), strlen ((S)))
+
+
+const char*
+driver_set_m_flag (int argc, const char **argv)
+{
+  int parm_off = 0;
+
+  if (argc != 1)
+    return "%eset_m_flag requires exactly 1 argument.";
+
+#undef PARM
+#define PARM (argv[0] + parm_off)
+
+/* Note: sizeof (OPTSTR_##NAME) equals the length of "<option>=".  */
+#undef MATCH_OPT
+#define MATCH_OPT(NAME) \
+  (strncmp (argv[0], OPTSTR_##NAME "=", \
+	    (parm_off = sizeof (OPTSTR_##NAME))) == 0)
+
+  if (strcmp (argv[0], "no_link") == 0)
+    {
+      no_link = 1;
+    }
+  else if (MATCH_OPT (ABI_BASE))
+    {
+      LARCH_DRIVER_SET_M_FLAG (
+	loongarch_abi_base_strings, N_ABI_BASE_TYPES,
+	opt_abi_base_driver, PARM)
+    }
+  else if (MATCH_OPT (ISA_EXT_FPU))
+    {
+      LARCH_DRIVER_SET_M_FLAG (loongarch_isa_ext_strings, N_ISA_EXT_FPU_TYPES,
+			       opt_fpu_driver, PARM)
+    }
+  else if (MATCH_OPT (ARCH))
+    {
+      LARCH_DRIVER_SET_M_FLAG (loongarch_cpu_strings, N_ARCH_TYPES,
+			       opt_arch_driver, PARM)
+    }
+  else if (MATCH_OPT (TUNE))
+    {
+      LARCH_DRIVER_SET_M_FLAG (loongarch_cpu_strings, N_TUNE_TYPES,
+			       opt_tune_driver, PARM)
+    }
+  else if (MATCH_OPT (CMODEL))
+    {
+      LARCH_DRIVER_SET_M_FLAG (loongarch_cmodel_strings, N_CMODEL_TYPES,
+			       opt_cmodel_driver, PARM)
+    }
+  else /* switches */
+    {
+      int switch_idx = M_OPTION_NOT_SEEN;
+
+      LARCH_DRIVER_SET_M_FLAG (loongarch_switch_strings, N_SWITCH_TYPES,
+			       switch_idx, argv[0])
+
+      if (switch_idx != M_OPTION_NOT_SEEN)
+	opt_switches |= loongarch_switch_mask[switch_idx];
+    }
+  return "";
+}
+
+const char*
+driver_get_normalized_m_opts (int argc, const char **argv)
+{
+  if (argc != 0)
+    {
+      (void) argv;  /* Silence the warning about the unused argument.  */
+      return "%eget_normalized_m_opts requires no argument.";
+    }
+
+  loongarch_config_target (& la_target,
+			   opt_switches,
+			   opt_arch_driver,
+			   opt_tune_driver,
+			   opt_fpu_driver,
+			   opt_abi_base_driver,
+			   opt_abi_ext_driver,
+			   opt_cmodel_driver,
+			   !no_link /* follow_multilib_list */);
+
+  /* Output normalized option strings.  */
+  obstack_blank (&opts_obstack, 0);
+
+#undef APPEND_LTR
+#define APPEND_LTR(S) \
+  obstack_grow (&opts_obstack, (const void*) (S), \
+		sizeof ((S)) / sizeof (char) -1)
+
+#undef APPEND_VAL
+#define APPEND_VAL(S) \
+  obstack_grow (&opts_obstack, (const void*) (S), strlen ((S)))
+
+#undef APPEND_OPT
+#define APPEND_OPT(NAME) \
+   APPEND_LTR (" %<m" OPTSTR_##NAME "=* " \
+	       " -m" OPTSTR_##NAME "=")
+
+  for (int i = 0; i < N_SWITCH_TYPES; i++)
+    {
+      APPEND_LTR (" %<m");
+      APPEND_VAL (loongarch_switch_strings[i]);
+    }
+
+  APPEND_OPT (ABI_BASE);
+  APPEND_VAL (loongarch_abi_base_strings[la_target.abi.base]);
+
+  APPEND_OPT (ARCH);
+  APPEND_VAL (loongarch_cpu_strings[la_target.cpu_arch]);
+
+  APPEND_OPT (ISA_EXT_FPU);
+  APPEND_VAL (loongarch_isa_ext_strings[la_target.isa.fpu]);
+
+  APPEND_OPT (CMODEL);
+  APPEND_VAL (loongarch_cmodel_strings[la_target.cmodel]);
+
+  APPEND_OPT (TUNE);
+  APPEND_VAL (loongarch_cpu_strings[la_target.cpu_tune]);
+
+  obstack_1grow (&opts_obstack, '\0');
+
+  return XOBFINISH (&opts_obstack, const char *);
+}
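
The effect of driver_get_normalized_m_opts is easiest to see on an example.
Assuming a toolchain configured with loongarch64 / lp64d / fpu64 defaults
(the defaults here are hypothetical), a compile with no -m options at all
gets rewritten by the returned spec into roughly

    %<msoft-float %<msingle-float %<mdouble-float
    %<mabi=* -mabi=lp64d %<march=* -march=loongarch64
    %<mfpu=* -mfpu=64 %<mcmodel=* -mcmodel=normal
    %<mtune=* -mtune=loongarch64

i.e. every -m option the user may have given is first erased (%<) and then
re-added in canonical form, so cc1 and the multilib selection always see one
complete, consistent option set.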
diff --git a/gcc/config/loongarch/loongarch-driver.h b/gcc/config/loongarch/loongarch-driver.h
new file mode 100644
index 00000000000..c4c83ab640c
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-driver.h
@@ -0,0 +1,69 @@
+/* Subroutine headers for the gcc driver.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef LOONGARCH_DRIVER_H
+#define LOONGARCH_DRIVER_H
+
+#include "loongarch-str.h"
+
+extern const char*
+driver_set_m_flag (int argc, const char **argv);
+
+extern const char*
+driver_get_normalized_m_opts (int argc, const char **argv);
+
+#define EXTRA_SPEC_FUNCTIONS \
+  { "set_m_flag", driver_set_m_flag  }, \
+  { "get_normalized_m_opts", driver_get_normalized_m_opts  },
+
+/* Pre-process ABI-related options.  */
+#define LA_SET_PARM_SPEC(NAME) \
+  " %{m" OPTSTR_##NAME  "=*: %:set_m_flag(" OPTSTR_##NAME "=%*)}" \
+
+#define LA_SET_FLAG_SPEC(NAME) \
+  " %{m" OPTSTR_##NAME  ": %:set_m_flag(" OPTSTR_##NAME ")}" \
+
+#define DRIVER_HANDLE_MACHINE_OPTIONS			      \
+  " %{c|S|E|nostdlib: %:set_m_flag(no_link)}"		      \
+  " %{nostartfiles: %{nodefaultlibs: %:set_m_flag(no_link)}}" \
+  LA_SET_PARM_SPEC (ABI_BASE)				      \
+  LA_SET_PARM_SPEC (ARCH)				      \
+  LA_SET_PARM_SPEC (TUNE)				      \
+  LA_SET_PARM_SPEC (ISA_EXT_FPU)			      \
+  LA_SET_PARM_SPEC (CMODEL)				      \
+  LA_SET_FLAG_SPEC (SOFT_FLOAT)				      \
+  LA_SET_FLAG_SPEC (SINGLE_FLOAT)			      \
+  LA_SET_FLAG_SPEC (DOUBLE_FLOAT)			      \
+  " %:get_normalized_m_opts()"
+
+#define DRIVER_SELF_SPECS \
+  DRIVER_HANDLE_MACHINE_OPTIONS
+
+/* ABI spec strings.  */
+#define ABI_GRLEN_SPEC \
+  "%{mabi=lp64*:64}"   \
+
+#define ABI_SPEC \
+  "%{mabi=lp64d:lp64d}" \
+  "%{mabi=lp64f:lp64f}" \
+  "%{mabi=lp64s:lp64s}" \
+
+#endif /* LOONGARCH_DRIVER_H */
+
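
Expanded for a single option, LA_SET_PARM_SPEC (ABI_BASE) yields the spec
fragment

    %{mabi=*: %:set_m_flag(abi=%*)}

so a command-line -mabi=lp64d is routed through the set_m_flag() callback in
loongarch-driver.cc as "abi=lp64d" before get_normalized_m_opts() computes
the final option set.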
diff --git a/gcc/config/loongarch/loongarch-opts.cc b/gcc/config/loongarch/loongarch-opts.cc
new file mode 100644
index 00000000000..74741d46cab
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -0,0 +1,580 @@
+/* Subroutines for loongarch-specific option handling.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "obstack.h"
+#include "diagnostic-core.h"
+
+#include "loongarch-cpu.h"
+#include "loongarch-opts.h"
+#include "loongarch-str.h"
+
+struct loongarch_target la_target;
+
+/* ABI-related configuration.  */
+#define ABI_COUNT (sizeof (abi_priority_list) / sizeof (struct loongarch_abi))
+static const struct loongarch_abi
+abi_priority_list[] = {
+    {ABI_BASE_LP64D, ABI_EXT_BASE},
+    {ABI_BASE_LP64F, ABI_EXT_BASE},
+    {ABI_BASE_LP64S, ABI_EXT_BASE},
+};
+
+/* Initialize enabled_abi_types from TM_MULTILIB_LIST.  */
+#ifdef __DISABLE_MULTILIB
+#define MULTILIB_LIST_LEN 1
+#else
+#define MULTILIB_LIST_LEN (sizeof (tm_multilib_list) / sizeof (int) / 2)
+static const int tm_multilib_list[] = { TM_MULTILIB_LIST };
+#endif
+static int enabled_abi_types[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES] = { 0 };
+
+#define isa_required(ABI) (abi_minimal_isa[(ABI).base][(ABI).ext])
+extern "C" const struct loongarch_isa
+abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES];
+
+static inline int
+is_multilib_enabled (struct loongarch_abi abi)
+{
+  return enabled_abi_types[abi.base][abi.ext];
+}
+
+static void
+init_enabled_abi_types ()
+{
+#ifdef __DISABLE_MULTILIB
+  enabled_abi_types[DEFAULT_ABI_BASE][DEFAULT_ABI_EXT] = 1;
+#else
+  int abi_base, abi_ext;
+  for (unsigned int i = 0; i < MULTILIB_LIST_LEN; i++)
+    {
+      abi_base = tm_multilib_list[i << 1];
+      abi_ext = tm_multilib_list[(i << 1) + 1];
+      enabled_abi_types[abi_base][abi_ext] = 1;
+    }
+#endif
+}
+
+/* Switch masks.  */
+#undef M
+#define M(NAME) OPTION_MASK_##NAME
+const int loongarch_switch_mask[N_SWITCH_TYPES] = {
+  /* SW_SOFT_FLOAT */    M(FORCE_SOFTF),
+  /* SW_SINGLE_FLOAT */  M(FORCE_F32),
+  /* SW_DOUBLE_FLOAT */  M(FORCE_F64),
+};
+#undef M
+
+/* String processing.  */
+static struct obstack msg_obstack;
+#define APPEND_STRING(STR) obstack_grow (&msg_obstack, STR, strlen (STR));
+#define APPEND1(CH) obstack_1grow (&msg_obstack, CH);
+
+static const char* abi_str (struct loongarch_abi abi);
+static const char* isa_str (const struct loongarch_isa *isa, char separator);
+static const char* arch_str (const struct loongarch_target *target);
+static const char* multilib_enabled_abi_list ();
+
+/* Misc */
+static struct loongarch_abi isa_default_abi (const struct loongarch_isa *isa);
+static int isa_base_compat_p (const struct loongarch_isa *set1,
+			      const struct loongarch_isa *set2);
+static int isa_fpu_compat_p (const struct loongarch_isa *set1,
+			     const struct loongarch_isa *set2);
+static int abi_compat_p (const struct loongarch_isa *isa,
+			 struct loongarch_abi abi);
+static int abi_default_cpu_arch (struct loongarch_abi abi);
+
+/* Checking configure-time defaults.  */
+#ifndef DEFAULT_ABI_BASE
+#error missing definition of DEFAULT_ABI_BASE in ${tm_defines}.
+#endif
+
+#ifndef DEFAULT_ABI_EXT
+#error missing definition of DEFAULT_ABI_EXT in ${tm_defines}.
+#endif
+
+#ifndef DEFAULT_CPU_ARCH
+#error missing definition of DEFAULT_CPU_ARCH in ${tm_defines}.
+#endif
+
+#ifndef DEFAULT_ISA_EXT_FPU
+#error missing definition of DEFAULT_ISA_EXT_FPU in ${tm_defines}.
+#endif
+
+/* Handle combinations of -m machine option values
+   (see loongarch.opt and loongarch-opts.h).  */
+void
+loongarch_config_target (struct loongarch_target *target,
+			 HOST_WIDE_INT opt_switches,
+			 int opt_arch, int opt_tune, int opt_fpu,
+			 int opt_abi_base, int opt_abi_ext,
+			 int opt_cmodel, int follow_multilib_list)
+{
+  struct loongarch_target t;
+
+  if (!target)
+    return;
+
+  /* Initialization */
+  init_enabled_abi_types ();
+  obstack_init (&msg_obstack);
+
+  struct {
+    int arch, tune, fpu, abi_base, abi_ext, cmodel;
+  } constrained = {
+      M_OPT_ABSENT(opt_arch)     ? 0 : 1,
+      M_OPT_ABSENT(opt_tune)     ? 0 : 1,
+      M_OPT_ABSENT(opt_fpu)      ? 0 : 1,
+      M_OPT_ABSENT(opt_abi_base) ? 0 : 1,
+      M_OPT_ABSENT(opt_abi_ext)  ? 0 : 1,
+      M_OPT_ABSENT(opt_cmodel)   ? 0 : 1,
+  };
+
+#define on(NAME) ((loongarch_switch_mask[(SW_##NAME)] & opt_switches) \
+		  && (on_switch = (SW_##NAME), 1))
+  int on_switch;
+
+  /* 1.  Target ABI */
+  t.abi.base = constrained.abi_base ? opt_abi_base : DEFAULT_ABI_BASE;
+
+  t.abi.ext = constrained.abi_ext ? opt_abi_ext : DEFAULT_ABI_EXT;
+
+  /* Extra switch handling.  */
+  if (on (SOFT_FLOAT) || on (SINGLE_FLOAT) || on (DOUBLE_FLOAT))
+    {
+      switch (on_switch)
+	{
+	  case SW_SOFT_FLOAT:
+	    opt_fpu = ISA_EXT_NOFPU;
+	    break;
+
+	  case SW_SINGLE_FLOAT:
+	    opt_fpu = ISA_EXT_FPU32;
+	    break;
+
+	  case SW_DOUBLE_FLOAT:
+	    opt_fpu = ISA_EXT_FPU64;
+	    break;
+
+	  default:
+	    gcc_unreachable();
+	}
+      constrained.fpu = 1;
+
+      /* The target ISA is not ready yet, but (isa_required (t.abi)
+	 + forced fpu) is enough for computing the forced base ABI.  */
+      struct loongarch_isa default_isa = isa_required (t.abi);
+      struct loongarch_isa force_isa = default_isa;
+      struct loongarch_abi force_abi = t.abi;
+      force_isa.fpu = opt_fpu;
+      force_abi.base = isa_default_abi (&force_isa).base;
+
+      if (isa_fpu_compat_p (&default_isa, &force_isa));
+	/* Keep quiet if -m*-float does not promote the FP ABI.  */
+      else if (constrained.abi_base && (t.abi.base != force_abi.base))
+	inform (UNKNOWN_LOCATION,
+		"%<-m%s%> overrides %<-m%s=%s%>, promoting ABI to %qs",
+		loongarch_switch_strings[on_switch],
+		OPTSTR_ABI_BASE, loongarch_abi_base_strings[t.abi.base],
+		abi_str (force_abi));
+
+      t.abi.base = force_abi.base;
+    }
+
+#ifdef __DISABLE_MULTILIB
+  if (follow_multilib_list)
+    if (t.abi.base != DEFAULT_ABI_BASE || t.abi.ext != DEFAULT_ABI_EXT)
+      {
+	static const struct loongarch_abi default_abi
+	  = {DEFAULT_ABI_BASE, DEFAULT_ABI_EXT};
+
+	warning (0, "ABI changed (%qs to %qs) while multilib is disabled",
+		 abi_str (default_abi), abi_str (t.abi));
+      }
+#endif
+
+  /* 2.  Target CPU */
+  t.cpu_arch = constrained.arch ? opt_arch : DEFAULT_CPU_ARCH;
+
+  t.cpu_tune = constrained.tune ? opt_tune
+    : (constrained.arch ? DEFAULT_CPU_ARCH : DEFAULT_CPU_TUNE);
+
+#ifdef __loongarch__
+  /* For native compilers, gather local CPU information
+     and fill the "CPU_NATIVE" index of arrays defined in
+     loongarch-cpu.cc.  */
+
+  t.cpu_native = fill_native_cpu_config (t.cpu_arch == CPU_NATIVE,
+					 t.cpu_tune == CPU_NATIVE);
+
+#else
+  if (t.cpu_arch == CPU_NATIVE)
+    fatal_error (UNKNOWN_LOCATION,
+		 "%qs does not work on a cross compiler",
+		 "-m" OPTSTR_ARCH "=" STR_CPU_NATIVE);
+
+  else if (t.cpu_tune == CPU_NATIVE)
+    fatal_error (UNKNOWN_LOCATION,
+		 "%qs does not work on a cross compiler",
+		 "-m" OPTSTR_TUNE "=" STR_CPU_NATIVE);
+#endif
+
+  /* 3.  Target ISA */
+config_target_isa:
+
+  /* Get default ISA from "-march" or its default value.  */
+  t.isa = loongarch_cpu_default_isa[__ACTUAL_ARCH];
+
+  /* Apply incremental changes.  */
+  /* "-march=native" overrides the default FPU type.  */
+  t.isa.fpu = constrained.fpu ? opt_fpu :
+    ((t.cpu_arch == CPU_NATIVE && constrained.arch) ?
+     t.isa.fpu : DEFAULT_ISA_EXT_FPU);
+
+
+  /* 4.  ABI-ISA compatibility */
+  /* Note:
+     - There IS a unique default -march value for each ABI type
+       (config.gcc: triplet -> abi -> default arch).
+
+     - If the base ABI is incompatible with the default arch,
+       try using the default -march it implies (and mark it
+       as "constrained" this time), then re-apply step 3.
+  */
+  struct loongarch_abi abi_tmp;
+  const struct loongarch_isa* isa_min;
+
+  abi_tmp = t.abi;
+  isa_min = &isa_required (abi_tmp);
+
+  if (isa_base_compat_p (&t.isa, isa_min)); /* OK.  */
+  else if (!constrained.arch)
+    {
+      /* Base architecture can only be implied by -march,
+	 so we adjust that first if it is not constrained.  */
+      int fallback_arch = abi_default_cpu_arch (t.abi);
+
+      if (t.cpu_arch == CPU_NATIVE)
+	warning (0, "your native CPU architecture (%qs) "
+		 "does not support %qs ABI, falling back to %<-m%s=%s%>",
+		 arch_str (&t), abi_str (t.abi), OPTSTR_ARCH,
+		 loongarch_cpu_strings[fallback_arch]);
+      else
+	warning (0, "default CPU architecture (%qs) "
+		 "does not support %qs ABI, falling back to %<-m%s=%s%>",
+		 arch_str (&t), abi_str (t.abi), OPTSTR_ARCH,
+		 loongarch_cpu_strings[fallback_arch]);
+
+      t.cpu_arch = fallback_arch;
+      constrained.arch = 1;
+      goto config_target_isa;
+    }
+  else if (!constrained.abi_base)
+    {
+      /* If -march is given while -mabi is not,
+	 try selecting another base ABI type.  */
+      abi_tmp.base = isa_default_abi (&t.isa).base;
+    }
+  else
+    goto fatal;
+
+  if (isa_fpu_compat_p (&t.isa, isa_min)); /* OK.  */
+  else if (!constrained.fpu)
+    t.isa.fpu = isa_min->fpu;
+  else if (!constrained.abi_base)
+      /* If -march is compatible with the default ABI
+	 while -mfpu is not.  */
+    abi_tmp.base = isa_default_abi (&t.isa).base;
+  else
+    goto fatal;
+
+  if (0)
+fatal:
+    fatal_error (UNKNOWN_LOCATION,
+		 "unable to implement ABI %qs with instruction set %qs",
+		 abi_str (t.abi), isa_str (&t.isa, '/'));
+
+
+  /* Using the fallback ABI.  */
+  if (abi_tmp.base != t.abi.base || abi_tmp.ext != t.abi.ext)
+    {
+      /* This flag is only set in the GCC driver.  */
+      if (follow_multilib_list)
+	{
+
+	  /* Continue falling back until we find a feasible ABI type
+	     enabled by TM_MULTILIB_LIST.  */
+	  if (!is_multilib_enabled (abi_tmp))
+	    {
+	      for (unsigned int i = 0; i < ABI_COUNT; i++)
+		{
+		  if (is_multilib_enabled (abi_priority_list[i])
+		      && abi_compat_p (&t.isa, abi_priority_list[i]))
+		    {
+		      abi_tmp = abi_priority_list[i];
+
+		      warning (0, "ABI %qs cannot be implemented due to "
+			       "limited instruction set %qs, "
+			       "falling back to %qs", abi_str (t.abi),
+			       isa_str (&t.isa, '/'), abi_str (abi_tmp));
+
+		      goto fallback;
+		    }
+		}
+
+	      /* Otherwise, keep using abi_tmp with a warning.  */
+#ifdef __DISABLE_MULTILIB
+	      warning (0, "instruction set %qs cannot implement "
+		       "default ABI %qs, falling back to %qs",
+		       isa_str (&t.isa, '/'), abi_str (t.abi),
+		       abi_str (abi_tmp));
+#else
+	      warning (0, "no multilib-enabled ABI (%qs) can be implemented "
+		       "with instruction set %qs, falling back to %qs",
+		       multilib_enabled_abi_list (),
+		       isa_str (&t.isa, '/'), abi_str (abi_tmp));
+#endif
+	    }
+	}
+
+fallback:
+      t.abi = abi_tmp;
+    }
+  else if (follow_multilib_list)
+    {
+      if (!is_multilib_enabled (t.abi))
+	{
+	  inform (UNKNOWN_LOCATION,
+		  "ABI %qs is not enabled at configure-time, "
+		  "the linker might report an error", abi_str (t.abi));
+
+	  inform (UNKNOWN_LOCATION, "ABI with startfiles: %s",
+		  multilib_enabled_abi_list ());
+	}
+    }
+
+
+  /* 5.  Target code model */
+  t.cmodel = constrained.cmodel ? opt_cmodel : CMODEL_NORMAL;
+
+  /* cleanup and return */
+  obstack_free (&msg_obstack, NULL);
+  *target = t;
+}
+
+/* Returns the default ABI for the given instruction set.  */
+static inline struct loongarch_abi
+isa_default_abi (const struct loongarch_isa *isa)
+{
+  struct loongarch_abi abi;
+
+  switch (isa->fpu)
+    {
+      case ISA_EXT_FPU64:
+	if (isa->base == ISA_BASE_LA64V100)
+	  abi.base = ABI_BASE_LP64D;
+	break;
+
+      case ISA_EXT_FPU32:
+	if (isa->base == ISA_BASE_LA64V100)
+	  abi.base = ABI_BASE_LP64F;
+	break;
+
+      case ISA_EXT_NOFPU:
+	if (isa->base == ISA_BASE_LA64V100)
+	  abi.base = ABI_BASE_LP64S;
+	break;
+
+      default:
+	gcc_unreachable ();
+    }
+
+  abi.ext = ABI_EXT_BASE;
+  return abi;
+}
+
+/* Check if set2 is a subset of set1.  */
+static inline int
+isa_base_compat_p (const struct loongarch_isa *set1,
+		   const struct loongarch_isa *set2)
+{
+  switch (set2->base)
+    {
+      case ISA_BASE_LA64V100:
+	return (set1->base == ISA_BASE_LA64V100);
+
+      default:
+	gcc_unreachable ();
+    }
+}
+
+static inline int
+isa_fpu_compat_p (const struct loongarch_isa *set1,
+		  const struct loongarch_isa *set2)
+{
+  switch (set2->fpu)
+    {
+      case ISA_EXT_FPU64:
+	return set1->fpu == ISA_EXT_FPU64;
+
+      case ISA_EXT_FPU32:
+	return set1->fpu == ISA_EXT_FPU32 || set1->fpu == ISA_EXT_FPU64;
+
+      case ISA_EXT_NOFPU:
+	return 1;
+
+      default:
+	gcc_unreachable ();
+    }
+
+}
+
+static inline int
+abi_compat_p (const struct loongarch_isa *isa, struct loongarch_abi abi)
+{
+  int compatible = 1;
+  const struct loongarch_isa *isa2 = &isa_required (abi);
+
+  /* Append conditionals for new ISA components below.  */
+  compatible = compatible && isa_base_compat_p (isa, isa2);
+  compatible = compatible && isa_fpu_compat_p (isa, isa2);
+  return compatible;
+}
+
+/* The behavior of this function should be consistent
+   with config.gcc.  */
+static inline int
+abi_default_cpu_arch (struct loongarch_abi abi)
+{
+  switch (abi.base)
+    {
+      case ABI_BASE_LP64D:
+      case ABI_BASE_LP64F:
+      case ABI_BASE_LP64S:
+	if (abi.ext == ABI_EXT_BASE)
+	  return CPU_LOONGARCH64;
+    }
+  gcc_unreachable ();
+}
+
+static const char*
+abi_str (struct loongarch_abi abi)
+{
+  /* "/base" can be omitted */
+  if (abi.ext == ABI_EXT_BASE)
+    return (const char*)
+      obstack_copy0 (&msg_obstack, loongarch_abi_base_strings[abi.base],
+		     strlen (loongarch_abi_base_strings[abi.base]));
+  else
+    {
+      APPEND_STRING (loongarch_abi_base_strings[abi.base])
+      APPEND1 ('/')
+      APPEND_STRING (loongarch_abi_ext_strings[abi.ext])
+      APPEND1 ('\0')
+
+      return XOBFINISH (&msg_obstack, const char *);
+    }
+}
+
+static const char*
+isa_str (const struct loongarch_isa *isa, char separator)
+{
+  APPEND_STRING (loongarch_isa_base_strings[isa->base])
+  APPEND1 (separator)
+
+  if (isa->fpu == ISA_EXT_NOFPU)
+    {
+      APPEND_STRING ("no" OPTSTR_ISA_EXT_FPU)
+    }
+  else
+    {
+      APPEND_STRING (OPTSTR_ISA_EXT_FPU)
+      APPEND_STRING (loongarch_isa_ext_strings[isa->fpu])
+    }
+  APPEND1 ('\0')
+
+  /* Add more here.  */
+
+  return XOBFINISH (&msg_obstack, const char *);
+}
+
+static const char*
+arch_str (const struct loongarch_target *target)
+{
+  if (target->cpu_arch == CPU_NATIVE)
+    {
+      if (target->cpu_native == CPU_NATIVE)
+	{
+	  /* Describe a native CPU with unknown PRID.  */
+	  const char* isa_string = isa_str (&target->isa, ',');
+	  APPEND_STRING ("PRID: 0x")
+	  APPEND_STRING (get_native_prid_str ())
+	  APPEND_STRING (", ISA features: ")
+	  APPEND_STRING (isa_string)
+	  APPEND1 ('\0')
+	}
+      else
+	APPEND_STRING (loongarch_cpu_strings[target->cpu_native]);
+    }
+  else
+    APPEND_STRING (loongarch_cpu_strings[target->cpu_arch]);
+
+  APPEND1 ('\0')
+  return XOBFINISH (&msg_obstack, const char *);
+}
+
+static const char*
+multilib_enabled_abi_list ()
+{
+  int enabled_abi_idx[MULTILIB_LIST_LEN] = { 0 };
+  const char* enabled_abi_str[MULTILIB_LIST_LEN] = { NULL };
+  unsigned int j = 0;
+
+  for (unsigned int i = 0; i < ABI_COUNT && j < MULTILIB_LIST_LEN; i++)
+    {
+      if (enabled_abi_types[abi_priority_list[i].base]
+	  [abi_priority_list[i].ext])
+	{
+	  enabled_abi_idx[j++] = i;
+	}
+    }
+
+  for (unsigned int k = 0; k < j; k++)
+    {
+      enabled_abi_str[k] = abi_str (abi_priority_list[enabled_abi_idx[k]]);
+    }
+
+  for (unsigned int k = 0; k < j - 1; k++)
+    {
+      APPEND_STRING (enabled_abi_str[k])
+      APPEND1 (',')
+      APPEND1 (' ')
+    }
+  APPEND_STRING (enabled_abi_str[j - 1])
+  APPEND1 ('\0')
+
+  return XOBFINISH (&msg_obstack, const char *);
+}
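
A worked example of the resolution order in loongarch_config_target above
(the configuration defaults are hypothetical; assume lp64d / loongarch64):
a plain -mfpu=none constrains the FPU but not the ABI, so the lp64d
requirement from abi_minimal_isa (FPU64) fails isa_fpu_compat_p and the base
ABI is demoted to lp64s via isa_default_abi, silently or with one of the
warnings above depending on which multilibs are enabled.  By contrast,
-mabi=lp64d -mfpu=none constrains both sides and ends in the fatal error

    unable to implement ABI 'lp64d' with instruction set 'la64/nofpu'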
diff --git a/gcc/config/loongarch/loongarch-opts.h b/gcc/config/loongarch/loongarch-opts.h
new file mode 100644
index 00000000000..ffd26e4397c
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-opts.h
@@ -0,0 +1,86 @@
+/* Definitions for loongarch-specific option handling.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef LOONGARCH_OPTS_H
+#define LOONGARCH_OPTS_H
+
+
+/* Target configuration */
+extern struct loongarch_target la_target;
+
+/* Switch masks */
+extern const int loongarch_switch_mask[];
+
+#include "loongarch-def.h"
+
+#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
+/* Handler for "-m" option combinations,
+   shared by the driver and the compiler proper.  */
+void
+loongarch_config_target (struct loongarch_target *target,
+  HOST_WIDE_INT opt_switches,
+  int opt_arch, int opt_tune, int opt_fpu,
+  int opt_abi_base, int opt_abi_ext,
+  int opt_cmodel, int follow_multilib_list);
+#endif
+
+
+/* Macros for common conditional expressions used in loongarch.{cc,h,md}.  */
+#define TARGET_CMODEL_NORMAL	    (la_target.cmodel == CMODEL_NORMAL)
+#define TARGET_CMODEL_TINY	    (la_target.cmodel == CMODEL_TINY)
+#define TARGET_CMODEL_TINY_STATIC   (la_target.cmodel == CMODEL_TINY_STATIC)
+#define TARGET_CMODEL_LARGE	    (la_target.cmodel == CMODEL_LARGE)
+#define TARGET_CMODEL_EXTREME	    (la_target.cmodel == CMODEL_EXTREME)
+
+#define TARGET_HARD_FLOAT	    (la_target.isa.fpu != ISA_EXT_NOFPU)
+#define TARGET_HARD_FLOAT_ABI	    (la_target.abi.base == ABI_BASE_LP64D \
+				     || la_target.abi.base == ABI_BASE_LP64F)
+
+#define TARGET_SOFT_FLOAT	  (la_target.isa.fpu == ISA_EXT_NOFPU)
+#define TARGET_SOFT_FLOAT_ABI	  (la_target.abi.base == ABI_BASE_LP64S)
+#define TARGET_SINGLE_FLOAT	  (la_target.isa.fpu == ISA_EXT_FPU32)
+#define TARGET_SINGLE_FLOAT_ABI	  (la_target.abi.base == ABI_BASE_LP64F)
+#define TARGET_DOUBLE_FLOAT	  (la_target.isa.fpu == ISA_EXT_FPU64)
+#define TARGET_DOUBLE_FLOAT_ABI	  (la_target.abi.base == ABI_BASE_LP64D)
+
+#define TARGET_64BIT		  (la_target.isa.base == ISA_BASE_LA64V100)
+#define TARGET_ABI_LP64	  (la_target.abi.base == ABI_BASE_LP64D \
+				   || la_target.abi.base == ABI_BASE_LP64F \
+				   || la_target.abi.base == ABI_BASE_LP64S)
+
+#define TARGET_ARCH_NATIVE	  (la_target.cpu_arch == CPU_NATIVE)
+#define __ACTUAL_ARCH		  (TARGET_ARCH_NATIVE \
+				   ? (la_target.cpu_native < N_ARCH_TYPES \
+				      ? (la_target.cpu_native) : (CPU_NATIVE)) \
+				      : (la_target.cpu_arch))
+
+#define TARGET_TUNE_NATIVE	(la_target.cpu_tune == CPU_NATIVE)
+#define __ACTUAL_TUNE		(TARGET_TUNE_NATIVE \
+				 ? (la_target.cpu_native < N_TUNE_TYPES \
+				    ? (la_target.cpu_native) : (CPU_NATIVE)) \
+				    : (la_target.cpu_tune))
+
+#define TARGET_ARCH_LOONGARCH64	  (__ACTUAL_ARCH == CPU_LOONGARCH64)
+#define TARGET_ARCH_LA464	  (__ACTUAL_ARCH == CPU_LA464)
+
+#define TARGET_TUNE_LOONGARCH64	  (__ACTUAL_TUNE == CPU_LOONGARCH64)
+#define TARGET_TUNE_LA464	  (__ACTUAL_TUNE == CPU_LA464)
+
+#endif /* LOONGARCH_OPTS_H */
diff --git a/gcc/config/loongarch/loongarch-str.h b/gcc/config/loongarch/loongarch-str.h
new file mode 100644
index 00000000000..afe95c9e607
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-str.h
@@ -0,0 +1,57 @@
+/* Generated automatically by "genstr" from "loongarch-strings".
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef LOONGARCH_STR_H
+#define LOONGARCH_STR_H
+
+#define OPTSTR_ARCH "arch"
+#define OPTSTR_TUNE "tune"
+
+#define STR_CPU_NATIVE "native"
+#define STR_CPU_LOONGARCH64 "loongarch64"
+#define STR_CPU_LA464 "la464"
+
+#define STR_ISA_BASE_LA64V100 "la64"
+
+#define OPTSTR_ISA_EXT_FPU "fpu"
+#define STR_ISA_EXT_NOFPU "none"
+#define STR_ISA_EXT_FPU0 "0"
+#define STR_ISA_EXT_FPU32 "32"
+#define STR_ISA_EXT_FPU64 "64"
+
+#define OPTSTR_SOFT_FLOAT "soft-float"
+#define OPTSTR_SINGLE_FLOAT "single-float"
+#define OPTSTR_DOUBLE_FLOAT "double-float"
+
+#define OPTSTR_ABI_BASE "abi"
+#define STR_ABI_BASE_LP64D "lp64d"
+#define STR_ABI_BASE_LP64F "lp64f"
+#define STR_ABI_BASE_LP64S "lp64s"
+
+#define STR_ABI_EXT_BASE "base"
+
+#define OPTSTR_CMODEL "cmodel"
+#define STR_CMODEL_NORMAL "normal"
+#define STR_CMODEL_TINY "tiny"
+#define STR_CMODEL_TS "tiny-static"
+#define STR_CMODEL_LARGE "large"
+#define STR_CMODEL_EXTREME "extreme"
+
+#endif /* LOONGARCH_STR_H */
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt
new file mode 100644
index 00000000000..b67c2ba39fe
--- /dev/null
+++ b/gcc/config/loongarch/loongarch.opt
@@ -0,0 +1,189 @@
+; Generated by "genstr" from the template "loongarch.opt.in"
+; and definitions from "loongarch-strings".
+;
+; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;
+; This file is part of GCC.
+;
+; GCC is free software; you can redistribute it and/or modify it under
+; the terms of the GNU General Public License as published by the Free
+; Software Foundation; either version 3, or (at your option) any later
+; version.
+;
+; GCC is distributed in the hope that it will be useful, but WITHOUT
+; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+; License for more details.
+;
+; You should have received a copy of the GNU General Public License
+; along with GCC; see the file COPYING3.  If not see
+; <http://www.gnu.org/licenses/>.
+;
+
+; Variables (macros) that should be exported by loongarch.opt:
+;   la_opt_switches,
+;   la_opt_abi_base, la_opt_abi_ext,
+;   la_opt_cpu_arch, la_opt_cpu_tune,
+;   la_opt_fpu,
+;   la_cmodel.
+
+HeaderInclude
+config/loongarch/loongarch-opts.h
+
+HeaderInclude
+config/loongarch/loongarch-str.h
+
+Variable
+HOST_WIDE_INT la_opt_switches = 0
+
+; ISA related options
+;; Base ISA
+Enum
+Name(isa_base) Type(int)
+Basic ISAs of LoongArch:
+
+EnumValue
+Enum(isa_base) String(la64) Value(ISA_BASE_LA64V100)
+
+
+;; ISA extensions / adjustments
+Enum
+Name(isa_ext_fpu) Type(int)
+FPU types of LoongArch:
+
+EnumValue
+Enum(isa_ext_fpu) String(none) Value(ISA_EXT_NOFPU)
+
+EnumValue
+Enum(isa_ext_fpu) String(32) Value(ISA_EXT_FPU32)
+
+EnumValue
+Enum(isa_ext_fpu) String(64) Value(ISA_EXT_FPU64)
+
+mfpu=
+Target RejectNegative Joined ToLower Enum(isa_ext_fpu) Var(la_opt_fpu) Init(M_OPTION_NOT_SEEN)
+-mfpu=FPU	Generate code for the given FPU.
+
+mfpu=0
+Target RejectNegative Alias(mfpu=,none)
+
+msoft-float
+Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_SOFTF) Negative(msingle-float)
+Prevent the use of all hardware floating-point instructions.
+
+msingle-float
+Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F32) Negative(mdouble-float)
+Restrict the use of hardware floating-point instructions to 32-bit operations.
+
+mdouble-float
+Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) Negative(msoft-float)
+Allow hardware floating-point instructions to cover both 32-bit and 64-bit operations.
+
+
+;; Base target models (implies ISA & tune parameters)
+Enum
+Name(cpu_type) Type(int)
+LoongArch CPU types:
+
+EnumValue
+Enum(cpu_type) String(native) Value(CPU_NATIVE)
+
+EnumValue
+Enum(cpu_type) String(loongarch64) Value(CPU_LOONGARCH64)
+
+EnumValue
+Enum(cpu_type) String(la464) Value(CPU_LA464)
+
+march=
+Target RejectNegative Joined Enum(cpu_type) Var(la_opt_cpu_arch) Init(M_OPTION_NOT_SEEN)
+-march=PROCESSOR	Generate code for the given PROCESSOR ISA.
+
+mtune=
+Target RejectNegative Joined Enum(cpu_type) Var(la_opt_cpu_tune) Init(M_OPTION_NOT_SEEN)
+-mtune=PROCESSOR	Generate optimized code for PROCESSOR.
+
+
+; ABI related options
+; (ISA constraints on ABI are handled dynamically)
+
+;; Base ABI
+Enum
+Name(abi_base) Type(int)
+Base ABI types for LoongArch:
+
+EnumValue
+Enum(abi_base) String(lp64d) Value(ABI_BASE_LP64D)
+
+EnumValue
+Enum(abi_base) String(lp64f) Value(ABI_BASE_LP64F)
+
+EnumValue
+Enum(abi_base) String(lp64s) Value(ABI_BASE_LP64S)
+
+mabi=
+Target RejectNegative Joined ToLower Enum(abi_base) Var(la_opt_abi_base) Init(M_OPTION_NOT_SEEN)
+-mabi=BASEABI	Generate code that conforms to the given BASEABI.
+
+;; ABI Extension
+Variable
+int la_opt_abi_ext = M_OPTION_NOT_SEEN
+
+
+mbranch-cost=
+Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
+-mbranch-cost=COST	Set the cost of branches to roughly COST instructions.
+
+mcheck-zero-division
+Target Mask(CHECK_ZERO_DIV)
+Trap on integer divide by zero.
+
+mcond-move-int
+Target Var(TARGET_COND_MOVE_INT) Init(1)
+Enable conditional moves of integral values.
+
+mcond-move-float
+Target Var(TARGET_COND_MOVE_FLOAT) Init(1)
+Enable conditional moves of floating-point values.
+
+mmemcpy
+Target Mask(MEMCPY)
+Don't optimize block moves.
+
+mlra
+Target Var(loongarch_lra_flag) Init(1) Save
+Use LRA instead of reload.
+
+noasmopt
+Driver
+
+mstrict-align
+Target Var(TARGET_STRICT_ALIGN) Init(0)
+Do not generate unaligned memory accesses.
+
+mmax-inline-memcpy-size=
+Target Joined RejectNegative UInteger Var(loongarch_max_inline_memcpy_size) Init(1024)
+-mmax-inline-memcpy-size=SIZE	Set the maximum size of memcpy calls to inline; the default is 1024 bytes.
+
+; The code model option names for -mcmodel.
+Enum
+Name(cmodel) Type(int)
+The code model option names for -mcmodel:
+
+EnumValue
+Enum(cmodel) String(normal) Value(CMODEL_NORMAL)
+
+EnumValue
+Enum(cmodel) String(tiny) Value(CMODEL_TINY)
+
+EnumValue
+Enum(cmodel) String(tiny-static) Value(CMODEL_TINY_STATIC)
+
+EnumValue
+Enum(cmodel) String(large) Value(CMODEL_LARGE)
+
+EnumValue
+Enum(cmodel) String(extreme) Value(CMODEL_EXTREME)
+
+mcmodel=
+Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel) Init(CMODEL_NORMAL)
+Specify the code model.
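
Note that the Negative() chain on the three float switches above is circular
(msoft-float -> msingle-float -> mdouble-float -> msoft-float), the usual
.opt idiom for a group of mutually exclusive options: the switch appearing
last on the command line wins.  For example:

    gcc -msoft-float -mdouble-float ...   # behaves as -mdouble-float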
diff --git a/gcc/config/loongarch/t-linux b/gcc/config/loongarch/t-linux
new file mode 100644
index 00000000000..77e2bbded8c
--- /dev/null
+++ b/gcc/config/loongarch/t-linux
@@ -0,0 +1,53 @@
+# Copyright (C) 2021-2022 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Multilib
+MULTILIB_OPTIONS = mabi=lp64d/mabi=lp64f/mabi=lp64s
+MULTILIB_DIRNAMES = base/lp64d base/lp64f base/lp64s
+
+# The GCC driver always gets all ABI-related options on the command line
+# (see loongarch-driver.cc:driver_get_normalized_m_opts).
+comma=,
+MULTILIB_REQUIRED = $(subst $(comma), ,$(TM_MULTILIB_CONFIG))
+
+# Multiarch
+ifneq ($(call if_multiarch,yes),yes)
+    # Define __DISABLE_MULTIARCH if multiarch is disabled.
+    tm_defines += __DISABLE_MULTIARCH
+else
+    # Only define MULTIARCH_DIRNAME when multiarch is enabled,
+    # or it would always introduce ${target} into the search path.
+    MULTIARCH_DIRNAME = $(LA_MULTIARCH_TRIPLET)
+endif
+
+# Don't define MULTILIB_OSDIRNAMES if multilib is disabled.
+ifeq ($(filter __DISABLE_MULTILIB,$(tm_defines)),)
+
+    MULTILIB_OSDIRNAMES = \
+      mabi.lp64d=../lib64$\
+      $(call if_multiarch,:loongarch64-linux-gnuf64)
+
+    MULTILIB_OSDIRNAMES += \
+      mabi.lp64f=../lib64/f32$\
+      $(call if_multiarch,:loongarch64-linux-gnuf32)
+
+    MULTILIB_OSDIRNAMES += \
+      mabi.lp64s=../lib64/sf$\
+      $(call if_multiarch,:loongarch64-linux-gnusf)
+
+endif
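
For reference, with all three ABIs enabled the settings above produce the
following startfile layout (multiarch names shown for a hypothetical
loongarch64-linux-gnu target):

    lib64/      -mabi=lp64d   (loongarch64-linux-gnuf64)
    lib64/f32/  -mabi=lp64f   (loongarch64-linux-gnuf32)
    lib64/sf/   -mabi=lp64s   (loongarch64-linux-gnusf)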
diff --git a/gcc/config/loongarch/t-loongarch b/gcc/config/loongarch/t-loongarch
new file mode 100644
index 00000000000..c106be1ec45
--- /dev/null
+++ b/gcc/config/loongarch/t-loongarch
@@ -0,0 +1,68 @@
+# Copyright (C) 2021-2022 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Canonical target triplet from config.gcc
+LA_MULTIARCH_TRIPLET = $(patsubst LA_MULTIARCH_TRIPLET=%,%,$\
+$(filter LA_MULTIARCH_TRIPLET=%,$(tm_defines)))
+
+# String definition header
+$(srcdir)/config/loongarch/loongarch-str.h: s-loongarch-str ; @true
+s-loongarch-str: $(srcdir)/config/loongarch/genopts/genstr.sh \
+	$(srcdir)/config/loongarch/genopts/loongarch-strings
+	$(SHELL) $(srcdir)/config/loongarch/genopts/genstr.sh header \
+    $(srcdir)/config/loongarch/genopts/loongarch-strings > \
+    tmp-loongarch-str.h
+	$(SHELL) $(srcdir)/../move-if-change tmp-loongarch-str.h \
+		$(srcdir)/config/loongarch/loongarch-str.h
+	$(STAMP) s-loongarch-str
+
+loongarch-c.o: $(srcdir)/config/loongarch/loongarch-c.cc $(CONFIG_H) $(SYSTEM_H) \
+	coretypes.h $(TM_H) $(TREE_H) output.h $(C_COMMON_H) $(TARGET_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+	$(srcdir)/config/loongarch/loongarch-c.cc
+
+loongarch-builtins.o: $(srcdir)/config/loongarch/loongarch-builtins.cc $(CONFIG_H) \
+	$(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(TREE_H) $(RECOG_H) langhooks.h \
+	$(DIAGNOSTIC_CORE_H) $(OPTABS_H) $(srcdir)/config/loongarch/loongarch-ftypes.def \
+	$(srcdir)/config/loongarch/loongarch-modes.def
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+	$(srcdir)/config/loongarch/loongarch-builtins.cc
+
+loongarch-driver.o : $(srcdir)/config/loongarch/loongarch-driver.cc $(LA_STR_H) \
+	$(CONFIG_H) $(SYSTEM_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+
+loongarch-opts.o: $(srcdir)/config/loongarch/loongarch-opts.cc $(LA_STR_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+
+loongarch-cpu.o: $(srcdir)/config/loongarch/loongarch-cpu.cc $(LA_STR_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $<
+
+loongarch-def.o: $(srcdir)/config/loongarch/loongarch-def.c $(LA_STR_H)
+	$(CC) -c $(ALL_CFLAGS) $(INCLUDES) $<
+
+$(srcdir)/config/loongarch/loongarch.opt: s-loongarch-opt ; @true
+s-loongarch-opt: $(srcdir)/config/loongarch/genopts/genstr.sh \
+	$(srcdir)/config/loongarch/genopts/loongarch.opt.in
+	$(SHELL) $(srcdir)/config/loongarch/genopts/genstr.sh opt \
+    $(srcdir)/config/loongarch/genopts/loongarch.opt.in \
+    > tmp-loongarch.opt
+	$(SHELL) $(srcdir)/../move-if-change tmp-loongarch.opt \
+    $(srcdir)/config/loongarch/loongarch.opt
+	$(STAMP) s-loongarch-opt
+
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 90cad309762..ea67fb62292 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -951,6 +951,9 @@ AC_ARG_ENABLE(fixed-point,
     mips*-*-*)
       enable_fixed_point=yes
       ;;
+    loongarch*-*-*)
+      enable_fixed_point=yes
+      ;;
     *)
       AC_MSG_WARN([fixed-point is not supported for this target, ignored])
       enable_fixed_point=no
@@ -3838,6 +3841,17 @@ foo:	data8	25
 	movl	r24 = @tprel(foo#)'
 	tls_as_opt=--fatal-warnings
 	;;
+  loongarch*-*-*)
+    conftest_s='
+	.section .tdata,"awT",@progbits
+x:	.word 2
+	.text
+	la.tls.gd $a0,x
+	bl __tls_get_addr'
+	tls_first_major=0
+	tls_first_minor=0
+	tls_as_opt='--fatal-warnings'
+	;;
   microblaze*-*-*)
     conftest_s='
 	.section .tdata,"awT",@progbits
@@ -5300,6 +5314,17 @@ configured with --enable-newlib-nano-formatted-io.])
       [AC_DEFINE(HAVE_AS_MARCH_ZIFENCEI, 1,
 		 [Define if the assembler understands -march=rv*_zifencei.])])
     ;;
+  loongarch*-*-*)
+    gcc_GAS_CHECK_FEATURE([.dtprelword support],
+      gcc_cv_as_loongarch_dtprelword,,
+      [.section .tdata,"awT",@progbits
+x:
+	.word 2
+	.text
+	.dtprelword x+0x8000],,
+      [AC_DEFINE(HAVE_AS_DTPRELWORD, 1,
+	  [Define if your assembler supports .dtprelword.])])
+    ;;
     s390*-*-*)
     gcc_GAS_CHECK_FEATURE([.gnu_attribute support],
       gcc_cv_as_s390_gnu_attribute,,
@@ -5333,11 +5358,11 @@ configured with --enable-newlib-nano-formatted-io.])
     ;;
 esac
 
-# Mips and HP-UX need the GNU assembler.
+# Mips, LoongArch and HP-UX need the GNU assembler.
 # Linux on IA64 might be able to use the Intel assembler.
 
 case "$target" in
-  mips*-*-* | *-*-hpux* )
+  mips*-*-* | loongarch*-*-* | *-*-hpux* )
     if test x$gas_flag = xyes \
        || test x"$host" != x"$build" \
        || test ! -x "$gcc_cv_as" \
@@ -5494,8 +5519,8 @@ esac
 # ??? Once 2.11 is released, probably need to add first known working
 # version to the per-target configury.
 case "$cpu_type" in
-  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | m32c | m68k \
-  | microblaze | mips | nds32 | nios2 | pa | riscv | rs6000 | score | sparc \
+  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | loongarch | m32c \
+  | m68k | microblaze | mips | nds32 | nios2 | pa | riscv | rs6000 | score | sparc \
   | tilegx | tilepro | visium | xstormy16 | xtensa)
     insn="nop"
     ;;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 03/12] LoongArch Port: Regenerate gcc/configure.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
  2022-03-04  7:17 ` [PATCH v8 01/12] LoongArch Port: Regenerate configure xuchenghua
  2022-03-04  7:17 ` [PATCH v8 02/12] LoongArch Port: gcc build xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-04  7:18 ` [PATCH v8 04/12] LoongArch Port: Machine description files xuchenghua
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/
        * configure: Regenerate.
---
 gcc/configure | 66 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 60 insertions(+), 6 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index 14b19c8fe0c..1c1195e95cb 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -5442,6 +5442,9 @@ case "${target}" in
 	# sets the default TLS model and affects inlining.
 	PICFLAG_FOR_TARGET=-fPIC
 	;;
+    loongarch*-*-*)
+	PICFLAG_FOR_TARGET=-fpic
+	;;
     mips-sgi-irix6*)
 	# PIC is the default.
 	;;
@@ -7963,6 +7966,9 @@ else
     mips*-*-*)
       enable_fixed_point=yes
       ;;
+    loongarch*-*-*)
+      enable_fixed_point=yes
+      ;;
     *)
       { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: fixed-point is not supported for this target, ignored" >&5
 $as_echo "$as_me: WARNING: fixed-point is not supported for this target, ignored" >&2;}
@@ -19667,7 +19673,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19670 "configure"
+#line 19676 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19773,7 +19779,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19776 "configure"
+#line 19782 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -25556,6 +25562,17 @@ foo:	data8	25
 	movl	r24 = @tprel(foo#)'
 	tls_as_opt=--fatal-warnings
 	;;
+  loongarch*-*-*)
+    conftest_s='
+	.section .tdata,"awT",@progbits
+x:	.word 2
+	.text
+	la.tls.gd $a0,x
+	bl __tls_get_addr'
+	tls_first_major=0
+	tls_first_minor=0
+	tls_as_opt='--fatal-warnings'
+	;;
   microblaze*-*-*)
     conftest_s='
 	.section .tdata,"awT",@progbits
@@ -28780,6 +28797,43 @@ $as_echo "#define HAVE_AS_MARCH_ZIFENCEI 1" >>confdefs.h
 fi
 
     ;;
+  loongarch*-*-*)
+    { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for .dtprelword support" >&5
+$as_echo_n "checking assembler for .dtprelword support... " >&6; }
+if ${gcc_cv_as_loongarch_dtprelword+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_loongarch_dtprelword=no
+  if test x$gcc_cv_as != x; then
+    $as_echo '.section .tdata,"awT",@progbits
+x:
+	.word 2
+	.text
+	.dtprelword x+0x8000' > conftest.s
+    if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+    then
+	gcc_cv_as_loongarch_dtprelword=yes
+    else
+      echo "configure: failed program was" >&5
+      cat conftest.s >&5
+    fi
+    rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_loongarch_dtprelword" >&5
+$as_echo "$gcc_cv_as_loongarch_dtprelword" >&6; }
+
+if test $gcc_cv_as_loongarch_dtprelword = yes; then
+
+$as_echo "#define HAVE_AS_DTPRELWORD 1" >>confdefs.h
+
+fi
+    ;;
     s390*-*-*)
     { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for .gnu_attribute support" >&5
 $as_echo_n "checking assembler for .gnu_attribute support... " >&6; }
@@ -28943,11 +28997,11 @@ fi
     ;;
 esac
 
-# Mips and HP-UX need the GNU assembler.
+# Mips, LoongArch and HP-UX need the GNU assembler.
 # Linux on IA64 might be able to use the Intel assembler.
 
 case "$target" in
-  mips*-*-* | *-*-hpux* )
+  mips*-*-* | loongarch*-*-* | *-*-hpux* )
     if test x$gas_flag = xyes \
        || test x"$host" != x"$build" \
        || test ! -x "$gcc_cv_as" \
@@ -29384,8 +29438,8 @@ esac
 # ??? Once 2.11 is released, probably need to add first known working
 # version to the per-target configury.
 case "$cpu_type" in
-  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | m32c | m68k \
-  | microblaze | mips | nds32 | nios2 | pa | riscv | rs6000 | score | sparc \
+  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | loongarch | m32c \
+  | m68k | microblaze | mips | nds32 | nios2 | pa | riscv | rs6000 | score | sparc \
   | tilegx | tilepro | visium | xstormy16 | xtensa)
     insn="nop"
     ;;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 04/12] LoongArch Port: Machine description files.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (2 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 03/12] LoongArch Port: Regenerate gcc/configure xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-06 16:16   ` Richard Sandiford
  2022-03-07 20:15   ` Xi Ruoyao
  2022-03-04  7:18 ` [PATCH v8 05/12] LoongArch Port: Machine description C files and .h files xuchenghua
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/
	* config/loongarch/constraints.md: New file.
	* config/loongarch/generic.md: New file.
	* config/loongarch/la464.md: New file.
	* config/loongarch/loongarch-ftypes.def: New file.
	* config/loongarch/loongarch-modes.def: New file.
	* config/loongarch/loongarch.md: New file.
	* config/loongarch/predicates.md: New file.
	* config/loongarch/sync.md: New file.
---
 gcc/config/loongarch/constraints.md       |  204 ++
 gcc/config/loongarch/generic.md           |  132 +
 gcc/config/loongarch/la464.md             |  132 +
 gcc/config/loongarch/loongarch-ftypes.def |  106 +
 gcc/config/loongarch/loongarch-modes.def  |   29 +
 gcc/config/loongarch/loongarch.md         | 3712 +++++++++++++++++++++
 gcc/config/loongarch/predicates.md        |  527 +++
 gcc/config/loongarch/sync.md              |  574 ++++
 8 files changed, 5416 insertions(+)
 create mode 100644 gcc/config/loongarch/constraints.md
 create mode 100644 gcc/config/loongarch/generic.md
 create mode 100644 gcc/config/loongarch/la464.md
 create mode 100644 gcc/config/loongarch/loongarch-ftypes.def
 create mode 100644 gcc/config/loongarch/loongarch-modes.def
 create mode 100644 gcc/config/loongarch/loongarch.md
 create mode 100644 gcc/config/loongarch/predicates.md
 create mode 100644 gcc/config/loongarch/sync.md

diff --git a/gcc/config/loongarch/constraints.md b/gcc/config/loongarch/constraints.md
new file mode 100644
index 00000000000..e3e3f79224b
--- /dev/null
+++ b/gcc/config/loongarch/constraints.md
@@ -0,0 +1,204 @@
+;; Constraint definitions for LoongArch.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Register constraints
+
+;; "a" "A constant call global and noplt address."
+;; "b" <-----unused
+;; "c" "A constant call local address."
+;; "d" <-----unused
+;; "e" JIRL_REGS
+;; "f" FP_REGS
+;; "g" <-----unused
+;; "h" "A constant call plt address."
+;; "i" "Matches a general integer constant." (Global non-architectural)
+;; "j" SIBCALL_REGS
+;; "k" "A memory operand whose address is formed by a base register and
+;;      (optionally scaled) index register."
+;; "l" "A signed 16-bit constant."
+;; "m" "A memory operand whose address is formed by a base register and offset
+;;      that is suitable for use in instructions with the same addressing mode
+;;      as @code{st.w} and @code{ld.w}."
+;; "n" "Matches a non-symbolic integer constant." (Global non-architectural)
+;; "o" "Matches an offsettable memory reference." (Global non-architectural)
+;; "p" "Matches a general address." (Global non-architectural)
+;; "q" CSR_REGS
+;; "r" GENERAL_REGS (Global non-architectural)
+;; "s" "Matches a symbolic integer constant." (Global non-architectural)
+;; "t" "A constant call weak address"
+;; "u" "A signed 52bit constant and low 32-bit is zero (for logic instructions)"
+;; "v" "A signed 64-bit constant and low 44-bit is zero (for logic instructions)."
+;; "w" "Matches any valid memory."
+;; "x" <-----unused
+;; "y" <-----unused
+;; "z" FCC_REGS
+;; "A" <-----unused
+;; "B" <-----unused
+;; "C" <-----unused
+;; "D" <-----unused
+;; "E" "Matches a floating-point constant." (Global non-architectural)
+;; "F" "Matches a floating-point constant." (Global non-architectural)
+;; "G" "Floating-point zero."
+;; "H" <-----unused
+;; "I" "A signed 12-bit constant (for arithmetic instructions)."
+;; "J" "Integer zero."
+;; "K" "An unsigned 12-bit constant (for logic instructions)."
+;; "L" <-----unused
+;; "M" <-----unused
+;; "N" <-----unused
+;; "O" <-----unused
+;; "P" <-----unused
+;; "Q" <-----unused
+;; "R" <-----unused
+;; "S" <-----unused
+;; "T" <-----unused
+;; "U" <-----unused
+;; "V" "Matches a non-offsettable memory reference." (Global non-architectural)
+;; "W" <-----unused
+;; "X" "Matches anything." (Global non-architectural)
+;; "Y" -
+;;    "Yd"
+;;       "A constant @code{move_operand} that can be safely loaded using
+;;	  @code{la}."
+;;    "Yx"
+;; "Z" -
+;;    "ZC"
+;;      "A memory operand whose address is formed by a base register and offset
+;;       that is suitable for use in instructions with the same addressing mode
+;;       as @code{ll.w} and @code{sc.w}."
+;;    "ZB"
+;;      "An address that is held in a general-purpose register.
+;;      The offset is zero."
+;; "<" "Matches a pre-dec or post-dec operand." (Global non-architectural)
+;; ">" "Matches a pre-inc or post-inc operand." (Global non-architectural)
+
+(define_constraint "a"
+  "@internal
+   A constant call global and noplt address."
+  (match_operand 0 "is_const_call_global_noplt_symbol"))
+
+(define_constraint "c"
+  "@internal
+   A constant call local address."
+  (match_operand 0 "is_const_call_local_symbol"))
+
+(define_register_constraint "e" "JIRL_REGS"
+  "@internal")
+
+(define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS : NO_REGS"
+  "A floating-point register (if available).")
+
+(define_constraint "h"
+  "@internal
+   A constant call plt address."
+  (match_operand 0 "is_const_call_plt_symbol"))
+
+(define_register_constraint "j" "SIBCALL_REGS"
+  "@internal")
+
+(define_memory_constraint "k"
+  "A memory operand whose address is formed by a base register and (optionally scaled)
+   index register."
+  (and (match_code "mem")
+       (not (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)"))
+       (not (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)"))))
+
+(define_constraint "l"
+"A signed 16-bit constant."
+(and (match_code "const_int")
+     (match_test "IMM16_OPERAND (ival)")))
+
+(define_memory_constraint "m"
+  "A memory operand whose address is formed by a base register and offset
+   that is suitable for use in instructions with the same addressing mode
+   as @code{st.w} and @code{ld.w}."
+  (and (match_code "mem")
+       (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)")))
+
+(define_register_constraint "q" "CSR_REGS"
+  "A general-purpose register except for $r0 and $r1 for lcsr.")
+
+(define_constraint "t"
+  "@internal
+   A constant call weak address."
+  (match_operand 0 "is_const_call_weak_symbol"))
+
+(define_constraint "u"
+  "A signed 52bit constant and low 32-bit is zero (for logic instructions)."
+  (and (match_code "const_int")
+       (match_test "LU32I_OPERAND (ival)")))
+
+(define_constraint "v"
+  "A nsigned 64-bit constant and low 44-bit is zero (for logic instructions)."
+  (and (match_code "const_int")
+       (match_test "LU52I_OPERAND (ival)")))
+
+(define_register_constraint "z" "FCC_REGS"
+  "A floating-point condition code register.")
+
+;; Floating-point constraints
+
+(define_constraint "G"
+  "Floating-point zero."
+  (and (match_code "const_double")
+       (match_test "op == CONST0_RTX (mode)")))
+
+;; Integer constraints
+
+(define_constraint "I"
+  "A signed 12-bit constant (for arithmetic instructions)."
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (ival)")))
+
+(define_constraint "J"
+  "Integer zero."
+  (and (match_code "const_int")
+       (match_test "ival == 0")))
+
+(define_constraint "K"
+  "An unsigned 12-bit constant (for logic instructions)."
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND_UNSIGNED (ival)")))
+
+(define_constraint "Yd"
+  "@internal
+   A constant @code{move_operand} that can be safely loaded using
+   @code{la}."
+  (and (match_operand 0 "move_operand")
+       (match_test "CONSTANT_P (op)")))
+
+(define_constraint "Yx"
+   "@internal"
+   (match_operand 0 "low_bitmask_operand"))
+
+(define_memory_constraint "ZC"
+  "A memory operand whose address is formed by a base register and offset
+   that is suitable for use in instructions with the same addressing mode
+   as @code{ll.w} and @code{sc.w}."
+  (and (match_code "mem")
+       (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)")))
+
+(define_memory_constraint "ZB"
+  "@internal
+  An address that is held in a general-purpose register.
+  The offset is zero"
+  (and (match_code "mem")
+       (match_test "GET_CODE (XEXP (op,0)) == REG")))
+
diff --git a/gcc/config/loongarch/generic.md b/gcc/config/loongarch/generic.md
new file mode 100644
index 00000000000..a3a01f674f3
--- /dev/null
+++ b/gcc/config/loongarch/generic.md
@@ -0,0 +1,132 @@
+;; Generic DFA-based pipeline description for LoongArch targets
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on MIPS target for GNU compiler.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; Pipeline descriptions.
+;;
+;; generic.md provides a fallback for processors without a specific
+;; pipeline description.  It is derived from the old define_function_unit
+;; version and uses the "alu" and "imuldiv" units declared below.
+;;
+;; Some of the processor-specific files are also derived from old
+;; define_function_unit descriptions and simply override the parts of
+;; generic.md that don't apply.  The other processor-specific files
+;; are self-contained.
+(define_automaton "alu,imuldiv")
+
+(define_cpu_unit "alu" "alu")
+(define_cpu_unit "imuldiv" "imuldiv")
+
+;; Ghost instructions produce no real code.
+;; They exist purely to express an effect on dataflow.
+(define_insn_reservation "ghost" 0
+  (eq_attr "type" "ghost")
+  "nothing")
+
+;; This file is derived from the old define_function_unit description.
+;; Each reservation can be overridden on a processor-by-processor basis.
+
+(define_insn_reservation "generic_alu" 1
+  (eq_attr "type" "unknown,prefetch,prefetchx,condmove,const,arith,
+		   shift,slt,clz,trap,multi,nop,logical,signext,move")
+  "alu")
+
+(define_insn_reservation "generic_load" 3
+  (eq_attr "type" "load,fpload,fpidxload")
+  "alu")
+
+(define_insn_reservation "generic_store" 1
+  (eq_attr "type" "store,fpstore,fpidxstore")
+  "alu")
+
+(define_insn_reservation "generic_xfer" 2
+  (eq_attr "type" "mftg,mgtf")
+  "alu")
+
+(define_insn_reservation "generic_branch" 1
+  (eq_attr "type" "branch,jump,call")
+  "alu")
+
+(define_insn_reservation "generic_imul" 17
+  (eq_attr "type" "imul")
+  "imuldiv*17")
+
+(define_insn_reservation "generic_fcvt" 1
+  (eq_attr "type" "fcvt")
+  "alu")
+
+(define_insn_reservation "generic_fmove" 2
+  (eq_attr "type" "fabs,fneg,fmove")
+  "alu")
+
+(define_insn_reservation "generic_fcmp" 3
+  (eq_attr "type" "fcmp")
+  "alu")
+
+(define_insn_reservation "generic_fadd" 4
+  (eq_attr "type" "fadd")
+  "alu")
+
+(define_insn_reservation "generic_fmul_single" 7
+  (and (eq_attr "type" "fmul,fmadd")
+       (eq_attr "mode" "SF"))
+  "alu")
+
+(define_insn_reservation "generic_fmul_double" 8
+  (and (eq_attr "type" "fmul,fmadd")
+       (eq_attr "mode" "DF"))
+  "alu")
+
+(define_insn_reservation "generic_fdiv_single" 23
+  (and (eq_attr "type" "fdiv,frdiv")
+       (eq_attr "mode" "SF"))
+  "alu")
+
+(define_insn_reservation "generic_fdiv_double" 36
+  (and (eq_attr "type" "fdiv,frdiv")
+       (eq_attr "mode" "DF"))
+  "alu")
+
+(define_insn_reservation "generic_fsqrt_single" 54
+  (and (eq_attr "type" "fsqrt,frsqrt")
+       (eq_attr "mode" "SF"))
+  "alu")
+
+(define_insn_reservation "generic_fsqrt_double" 112
+  (and (eq_attr "type" "fsqrt,frsqrt")
+       (eq_attr "mode" "DF"))
+  "alu")
+
+(define_insn_reservation "generic_atomic" 10
+  (eq_attr "type" "atomic")
+  "alu")
+
+;; Sync loop consists of (in order)
+;; (1) optional sync,
+;; (2) LL instruction,
+;; (3) branch and 1-2 ALU instructions,
+;; (4) SC instruction,
+;; (5) branch and ALU instruction.
+;; The net result of this reservation is a big delay with a flush of
+;; ALU pipeline.
+(define_insn_reservation "generic_sync_loop" 40
+  (eq_attr "type" "syncloop")
+  "alu*39")
diff --git a/gcc/config/loongarch/la464.md b/gcc/config/loongarch/la464.md
new file mode 100644
index 00000000000..ae3808b51bb
--- /dev/null
+++ b/gcc/config/loongarch/la464.md
@@ -0,0 +1,132 @@
+;; Pipeline model for LoongArch LA464 cores.
+
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "la464_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "la464_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "la464_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "la464_alu1" "la464_a_alu")
+(define_cpu_unit "la464_alu2" "la464_a_alu")
+(define_cpu_unit "la464_mem1" "la464_a_mem")
+(define_cpu_unit "la464_mem2" "la464_a_mem")
+(define_cpu_unit "la464_falu1" "la464_a_falu")
+(define_cpu_unit "la464_falu2" "la464_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "la464_arith" 1
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_branch" 1
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_imul" 7
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "imul"))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_idiv_si" 12
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "idiv")
+	    (eq_attr "mode" "SI")))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_idiv_di" 25
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "idiv")
+	    (eq_attr "mode" "DI")))
+  "la464_alu1 | la464_alu2")
+
+(define_insn_reservation "la464_load" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "load"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_gpr_fp" 16
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "mftg,mgtf"))
+  "la464_mem1")
+
+(define_insn_reservation "la464_fpload" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fpload"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_prefetch" 0
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "prefetch,prefetchx"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_store" 0
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "store,fpstore,fpidxstore"))
+  "la464_mem1 | la464_mem2")
+
+(define_insn_reservation "la464_fadd" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fadd,fmul,fmadd"))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fcmp" 2
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fabs,fcmp,fmove,fneg"))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fcvt" 4
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "fcvt"))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fdiv_sf" 12
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt")
+	    (eq_attr "mode" "SF")))
+  "la464_falu1 | la464_falu2")
+
+(define_insn_reservation "la464_fdiv_df" 19
+  (and (match_test "TARGET_ARCH_LA464")
+       (and (eq_attr "type" "fdiv,frdiv,fsqrt,frsqrt")
+	    (eq_attr "mode" "DF")))
+  "la464_falu1 | la464_falu2")
+
+;; Force single-dispatch for unknown or multi.
+(define_insn_reservation "la464_unknown" 1
+  (and (match_test "TARGET_ARCH_LA464")
+       (eq_attr "type" "unknown,multi,atomic,syncloop"))
+  "la464_alu1 + la464_alu2 + la464_falu1
+   + la464_falu2 + la464_mem1 + la464_mem2")
+
+;; End of DFA-based pipeline description for la464
diff --git a/gcc/config/loongarch/loongarch-ftypes.def b/gcc/config/loongarch/loongarch-ftypes.def
new file mode 100644
index 00000000000..2137a3d2c9f
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-ftypes.def
@@ -0,0 +1,106 @@
+/* Definitions of prototypes for LoongArch built-in functions.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Invoke DEF_LARCH_FTYPE (NARGS, LIST) for each prototype used by
+   LoongArch built-in functions, where:
+
+      NARGS is the number of arguments.
+      LIST contains the return-type code followed by the codes for each
+      argument type.
+
+   Argument- and return-type codes are either modes or one of the following:
+
+      VOID for void_type_node
+      INT for integer_type_node
+      POINTER for ptr_type_node
+
+   (we don't use PTR because that's an ANSI-compatibility macro).
+
+   Please keep this list lexicographically sorted by the LIST argument.  */
+
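+/* For instance, the entry "DEF_LARCH_FTYPE (2, (SI, SI, SI))" below
+   describes built-ins whose C prototype is "int f (int, int)"; one
+   built-in of that shape, assuming the naming used by this port's
+   builtin tables, is __builtin_loongarch_crc_w_w_w.  */
+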
+DEF_LARCH_FTYPE (1, (DF, DF))
+DEF_LARCH_FTYPE (2, (DF, DF, DF))
+
+DEF_LARCH_FTYPE (1, (DI, DI))
+DEF_LARCH_FTYPE (2, (DI, DI, DI))
+DEF_LARCH_FTYPE (3, (DI, DI, DI, QI))
+DEF_LARCH_FTYPE (2, (DI, DI, SI))
+DEF_LARCH_FTYPE (3, (DI, DI, SI, SI))
+DEF_LARCH_FTYPE (2, (DI, DI, UQI))
+DEF_LARCH_FTYPE (3, (DI, DI, USI, USI))
+DEF_LARCH_FTYPE (2, (DI, POINTER, SI))
+DEF_LARCH_FTYPE (1, (DI, SI))
+DEF_LARCH_FTYPE (2, (DI, SI, SI))
+DEF_LARCH_FTYPE (1, (DI, UQI))
+DEF_LARCH_FTYPE (2, (DI, USI, USI))
+
+DEF_LARCH_FTYPE (2, (HI, HI, HI))
+
+DEF_LARCH_FTYPE (2, (INT, DF, DF))
+DEF_LARCH_FTYPE (2, (INT, SF, SF))
+
+DEF_LARCH_FTYPE (2, (QI, QI, QI))
+
+DEF_LARCH_FTYPE (1, (SF, SF))
+DEF_LARCH_FTYPE (2, (SF, SF, SF))
+
+DEF_LARCH_FTYPE (2, (SI, DI, SI))
+DEF_LARCH_FTYPE (2, (SI, HI, SI))
+DEF_LARCH_FTYPE (2, (SI, POINTER, SI))
+DEF_LARCH_FTYPE (2, (SI, QI, SI))
+DEF_LARCH_FTYPE (1, (SI, SI))
+DEF_LARCH_FTYPE (2, (SI, SI, SI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, QI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, SI))
+DEF_LARCH_FTYPE (2, (SI, SI, UQI))
+DEF_LARCH_FTYPE (1, (SI, UDI))
+DEF_LARCH_FTYPE (1, (SI, UQI))
+DEF_LARCH_FTYPE (1, (SI, VOID))
+
+DEF_LARCH_FTYPE (2, (UDI, UDI, UDI))
+DEF_LARCH_FTYPE (3, (UDI, UDI, UDI, USI))
+DEF_LARCH_FTYPE (2, (UDI, UDI, USI))
+DEF_LARCH_FTYPE (1, (UDI, USI))
+
+DEF_LARCH_FTYPE (1, (UHI, USI))
+
+DEF_LARCH_FTYPE (1, (UQI, USI))
+
+DEF_LARCH_FTYPE (1, (USI, UQI))
+DEF_LARCH_FTYPE (1, (USI, USI))
+DEF_LARCH_FTYPE (2, (USI, USI, USI))
+DEF_LARCH_FTYPE (3, (USI, USI, USI, USI))
+DEF_LARCH_FTYPE (1, (USI, VOID))
+
+DEF_LARCH_FTYPE (2, (VOID, DI, DI))
+DEF_LARCH_FTYPE (2, (VOID, DI, UQI))
+DEF_LARCH_FTYPE (2, (VOID, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (VOID, SI, SI))
+DEF_LARCH_FTYPE (2, (VOID, SI, UQI))
+DEF_LARCH_FTYPE (2, (VOID, UDI, USI))
+DEF_LARCH_FTYPE (2, (VOID, UHI, USI))
+DEF_LARCH_FTYPE (2, (VOID, UQI, SI))
+DEF_LARCH_FTYPE (2, (VOID, UQI, USI))
+DEF_LARCH_FTYPE (1, (VOID, USI))
+DEF_LARCH_FTYPE (3, (VOID, USI, UDI, SI))
+DEF_LARCH_FTYPE (2, (VOID, USI, UQI))
+DEF_LARCH_FTYPE (2, (VOID, USI, USI))
+DEF_LARCH_FTYPE (3, (VOID, USI, USI, SI))
diff --git a/gcc/config/loongarch/loongarch-modes.def b/gcc/config/loongarch/loongarch-modes.def
new file mode 100644
index 00000000000..c0261b2cc83
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-modes.def
@@ -0,0 +1,29 @@
+/* LoongArch extra machine modes.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+FLOAT_MODE (TF, 16, ieee_quad_format);
+
+VECTOR_MODES (FLOAT, 8);      /*       V4HF V2SF */
+
+/* For floating point conditions in FCC registers.  */
+CC_MODE (FCC);
+
+INT_MODE (OI, 32);
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
new file mode 100644
index 00000000000..a9a8bc4b038
--- /dev/null
+++ b/gcc/config/loongarch/loongarch.md
@@ -0,0 +1,3712 @@
+;; Machine Description for LoongArch for GNU compiler.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on MIPS target for GNU compiler.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_c_enum "unspec" [
+  ;; Integer operations that are too cumbersome to describe directly.
+  UNSPEC_REVB_2H
+  UNSPEC_REVB_4H
+  UNSPEC_REVH_D
+
+  ;; Floating-point moves.
+  UNSPEC_LOAD_LOW
+  UNSPEC_LOAD_HIGH
+  UNSPEC_STORE_WORD
+  UNSPEC_MOVGR2FRH
+  UNSPEC_MOVFRH2GR
+
+  ;; Floating point unspecs.
+  UNSPEC_FRINT
+  UNSPEC_FCLASS
+
+  ;; Override return address for exception handling.
+  UNSPEC_EH_RETURN
+
+  ;; Bit operation
+  UNSPEC_BYTEPICK_W
+  UNSPEC_BYTEPICK_D
+  UNSPEC_BITREV_4B
+  UNSPEC_BITREV_8B
+
+  ;; TLS
+  UNSPEC_TLS_GD
+  UNSPEC_TLS_LD
+  UNSPEC_TLS_LE
+  UNSPEC_TLS_IE
+
+  ;; Stack tie
+  UNSPEC_TIE
+
+  ;; CRC
+  UNSPEC_CRC
+  UNSPEC_CRCC
+])
+
+(define_c_enum "unspecv" [
+  ;; Blockage and synchronisation.
+  UNSPECV_BLOCKAGE
+  UNSPECV_DBAR
+  UNSPECV_IBAR
+
+  ;; CPUCFG
+  UNSPECV_CPUCFG
+  UNSPECV_ASRTLE_D
+  UNSPECV_ASRTGT_D
+
+  ;; Privileged instructions
+  UNSPECV_CSRRD
+  UNSPECV_CSRWR
+  UNSPECV_CSRXCHG
+  UNSPECV_IOCSRRD
+  UNSPECV_IOCSRWR
+  UNSPECV_CACOP
+  UNSPECV_LDDIR
+  UNSPECV_LDPTE
+  UNSPECV_ERTN
+
+  ;; Stack checking.
+  UNSPECV_PROBE_STACK_RANGE
+
+  ;; Floating-point environment.
+  UNSPECV_MOVFCSR2GR
+  UNSPECV_MOVGR2FCSR
+
+])
+
+(define_constants
+  [(RETURN_ADDR_REGNUM		1)
+   (T0_REGNUM			12)
+   (T1_REGNUM			13)
+   (S0_REGNUM			23)
+
+   ;; PIC long branch sequences are never longer than 100 bytes.
+   (MAX_PIC_BRANCH_LENGTH	100)
+])
+
+(include "predicates.md")
+(include "constraints.md")
+\f
+;; ....................
+;;
+;;	Attributes
+;;
+;; ....................
+
+(define_attr "got" "unset,load"
+  (const_string "unset"))
+
+;; For jirl instructions, this attribute is DIRECT when the target address
+;; is symbolic and INDIRECT when it is a register.
+(define_attr "jirl" "unset,direct,indirect"
+  (const_string "unset"))
+
+
+;; Classification of moves, extensions and truncations.  Most values
+;; are as for "type" (see below) but there are also the following
+;; move-specific values:
+;;
+;; sll0		"slli.w DEST,SRC,0", which on 64-bit targets is guaranteed
+;;		to produce a sign-extended DEST, even if SRC is not
+;;		properly sign-extended
+;; pick_ins	BSTRPICK.W, BSTRPICK.D, BSTRINS.W or BSTRINS.D instruction
+;; andi		a single ANDI instruction
+;; shift_shift	a shift left followed by a shift right
+;;
+;; This attribute is used to determine the instruction's length and
+;; scheduling type.  For doubleword moves, the attribute always describes
+;; the split instructions; in some cases, it is more appropriate for the
+;; scheduling type to be "multi" instead.
+(define_attr "move_type"
+  "unknown,load,fpload,store,fpstore,mgtf,mftg,imul,move,fmove,
+   const,signext,pick_ins,logical,arith,sll0,andi,shift_shift"
+  (const_string "unknown"))
+
+(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor"
+  (const_string "unknown"))
+
+;; Main data type used by the insn
+(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,OI,SF,DF,TF,FCC"
+  (const_string "unknown"))
+
+;; True if the main data type is twice the size of a word.
+(define_attr "dword_mode" "no,yes"
+  (cond [(and (eq_attr "mode" "DI,DF")
+	      (not (match_test "TARGET_64BIT")))
+	 (const_string "yes")
+
+	 (and (eq_attr "mode" "TI,TF")
+	      (match_test "TARGET_64BIT"))
+	 (const_string "yes")]
+	(const_string "no")))
+
+;; True if the main data type is four times the size of a word.
+(define_attr "qword_mode" "no,yes"
+  (cond [(and (eq_attr "mode" "TI,TF")
+	      (not (match_test "TARGET_64BIT")))
+	 (const_string "yes")]
+	(const_string "no")))
+
+;; True if the main data type is eight times the size of a word.
+(define_attr "oword_mode" "no,yes"
+  (cond [(and (eq_attr "mode" "OI")
+	      (not (match_test "TARGET_64BIT")))
+	 (const_string "yes")]
+	(const_string "no")))
+
+;; Classification of each insn.
+;; branch	conditional branch
+;; jump		unconditional jump
+;; call		unconditional call
+;; load		load instruction(s)
+;; fpload	floating point load
+;; fpidxload    floating point indexed load
+;; store	store instruction(s)
+;; fpstore	floating point store
+;; fpidxstore	floating point indexed store
+;; prefetch	memory prefetch (register + offset)
+;; prefetchx	memory indexed prefetch (register + register)
+;; condmove	conditional moves
+;; mgtf		move general-purpose register to floating point register
+;; mftg		move floating point register to general-purpose register
+;; const	load constant
+;; arith	integer arithmetic instructions
+;; logical      integer logical instructions
+;; shift	integer shift instructions
+;; slt		set less than instructions
+;; signext      sign extend instructions
+;; clz		the clz and clo instructions
+;; trap		trap if instructions
+;; imul		integer multiply
+;; idiv		integer divide
+;; move		integer move
+;; fmove	floating point register move
+;; fadd		floating point add/subtract
+;; fmul		floating point multiply
+;; fmadd	floating point multiply-add
+;; fdiv		floating point divide
+;; frdiv	floating point reciprocal divide
+;; fabs		floating point absolute value
+;; fneg		floating point negation
+;; fcmp		floating point compare
+;; fcvt		floating point convert
+;; fsqrt	floating point square root
+;; frsqrt       floating point reciprocal square root
+;; multi	multiword sequence (or user asm statements)
+;; atomic	atomic memory update instruction
+;; syncloop	memory atomic operation implemented as a sync loop
+;; nop		no operation
+;; ghost	an instruction that produces no real code
+(define_attr "type"
+  "unknown,branch,jump,call,load,fpload,fpidxload,store,fpstore,fpidxstore,
+   prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical,
+   shift,slt,signext,clz,trap,imul,idiv,move,
+   fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,fneg,fcmp,fcvt,fsqrt,
+   frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
+  (cond [(eq_attr "jirl" "!unset") (const_string "call")
+	 (eq_attr "got" "load") (const_string "load")
+
+	 (eq_attr "alu_type" "add,sub") (const_string "arith")
+
+	 (eq_attr "alu_type" "not,nor,and,or,xor") (const_string "logical")
+
+	 ;; If a doubleword move uses these expensive instructions,
+	 ;; it is usually better to schedule them in the same way
+	 ;; as the singleword form, rather than as "multi".
+	 (eq_attr "move_type" "load") (const_string "load")
+	 (eq_attr "move_type" "fpload") (const_string "fpload")
+	 (eq_attr "move_type" "store") (const_string "store")
+	 (eq_attr "move_type" "fpstore") (const_string "fpstore")
+	 (eq_attr "move_type" "mgtf") (const_string "mgtf")
+	 (eq_attr "move_type" "mftg") (const_string "mftg")
+
+	 ;; These types of move are always single insns.
+	 (eq_attr "move_type" "imul") (const_string "imul")
+	 (eq_attr "move_type" "fmove") (const_string "fmove")
+	 (eq_attr "move_type" "signext") (const_string "signext")
+	 (eq_attr "move_type" "pick_ins") (const_string "arith")
+	 (eq_attr "move_type" "arith") (const_string "arith")
+	 (eq_attr "move_type" "logical") (const_string "logical")
+	 (eq_attr "move_type" "sll0") (const_string "shift")
+	 (eq_attr "move_type" "andi") (const_string "logical")
+
+	 ;; These types of move are always split.
+	 (eq_attr "move_type" "shift_shift")
+	   (const_string "multi")
+
+	 ;; These types of move are split for octaword modes only.
+	 (and (eq_attr "move_type" "move,const")
+	      (eq_attr "oword_mode" "yes"))
+	   (const_string "multi")
+
+	 ;; These types of move are split for quadword modes only.
+	 (and (eq_attr "move_type" "move,const")
+	      (eq_attr "qword_mode" "yes"))
+	   (const_string "multi")
+
+	 ;; These types of move are split for doubleword modes only.
+	 (and (eq_attr "move_type" "move,const")
+	      (eq_attr "dword_mode" "yes"))
+	   (const_string "multi")
+	 (eq_attr "move_type" "move") (const_string "move")
+	 (eq_attr "move_type" "const") (const_string "const")]
+	(const_string "unknown")))
+
+;; Mode for conversion types (fcvt)
+;; I2S	integer to float single (SI/DI to SF)
+;; I2D	integer to float double (SI/DI to DF)
+;; S2I	float to integer (SF to SI/DI)
+;; D2I	float to integer (DF to SI/DI)
+;; D2S	double to float single
+;; S2D	float single to double
+
+(define_attr "cnv_mode" "unknown,I2S,I2D,S2I,D2I,D2S,S2D"
+  (const_string "unknown"))
+
+(define_attr "compression" "none,all"
+  (const_string "none"))
+
+;; The number of individual instructions that a non-branch pattern generates
+(define_attr "insn_count" ""
+  (cond [;; "Ghost" instructions occupy no space.
+	 (eq_attr "type" "ghost")
+	 (const_int 0)
+
+	 ;; Check for doubleword moves that are decomposed into two
+	 ;; instructions.
+	 (and (eq_attr "move_type" "mgtf,mftg,move")
+	      (eq_attr "dword_mode" "yes"))
+	 (const_int 2)
+
+	 ;; Check for quadword moves that are decomposed into four
+	 ;; instructions.
+	 (and (eq_attr "move_type" "mgtf,mftg,move")
+	      (eq_attr "qword_mode" "yes"))
+	 (const_int 4)
+
+	 ;; Check for octaword moves that are decomposed into eight
+	 ;; instructions.
+	 (and (eq_attr "move_type" "mgtf,mftg,move")
+	      (eq_attr "oword_mode" "yes"))
+	 (const_int 8)
+
+	 ;; Constants, loads and stores are handled by external routines.
+	 (and (eq_attr "move_type" "const")
+	      (eq_attr "dword_mode" "yes"))
+	 (symbol_ref "loongarch_split_const_insns (operands[1])")
+	 (eq_attr "move_type" "const")
+	 (symbol_ref "loongarch_const_insns (operands[1])")
+	 (eq_attr "move_type" "load,fpload")
+	 (symbol_ref "loongarch_load_store_insns (operands[1], insn)")
+	 (eq_attr "move_type" "store,fpstore")
+	 (symbol_ref "loongarch_load_store_insns (operands[0], insn)")
+
+	 (eq_attr "type" "idiv")
+	 (symbol_ref "loongarch_idiv_insns (GET_MODE (PATTERN (insn)))")]
+(const_int 1)))
+
+;; Length of instruction in bytes.
+(define_attr "length" ""
+   (cond [
+	  ;; Branches further than +/- 128 KiB require two instructions.
+	  (eq_attr "type" "branch")
+	  (if_then_else (and (le (minus (match_dup 0) (pc)) (const_int 131064))
+			     (le (minus (pc) (match_dup 0)) (const_int 131068)))
+	  (const_int 4)
+	  (const_int 8))
+	  ](symbol_ref "get_attr_insn_count (insn) * 4")))
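+
+;; Example: a conditional branch whose target lies inside the window
+;; above is a single 4-byte instruction; a farther target is reached
+;; by branching over an unconditional jump, for 8 bytes in total.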
+
+;; Describe a user's asm statement.
+(define_asm_attributes
+  [(set_attr "type" "multi")])
+\f
+;; This mode iterator allows 32-bit and 64-bit GPR patterns to be generated
+;; from the same template.
+(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
+
+;; A copy of GPR that can be used when a pattern has two independent
+;; modes.
+(define_mode_iterator GPR2 [SI (DI "TARGET_64BIT")])
+
+;; Likewise, but for XLEN-sized quantities.
+(define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
+
+;; This mode iterator allows 16-bit and 32-bit GPR patterns and 32-bit and
+;; 64-bit FPR patterns to be generated from the same template.
+(define_mode_iterator JOIN_MODE [HI
+				 SI
+				 (SF "TARGET_HARD_FLOAT")
+				 (DF "TARGET_DOUBLE_FLOAT")])
+
+;; This mode iterator allows :P to be used for patterns that operate on
+;; pointer-sized quantities.  Exactly one of the two alternatives will match.
+(define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
+
+;; 64-bit modes for which we provide move patterns.
+(define_mode_iterator MOVE64 [DI DF])
+
+;; 128-bit modes for which we provide move patterns on 64-bit targets.
+(define_mode_iterator MOVE128 [TI TF])
+
+;; Iterator for sub-32-bit integer modes.
+(define_mode_iterator SHORT [QI HI])
+
+;; Likewise the 64-bit truncate-and-shift patterns.
+(define_mode_iterator SUBDI [QI HI SI])
+
+;; This mode iterator allows the QI, HI, SI and DI extension patterns
+;; to be generated from the same template.
+(define_mode_iterator QHWD [QI HI SI (DI "TARGET_64BIT")])
+
+;; Iterator for hardware-supported floating-point modes.
+(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
+			    (DF "TARGET_DOUBLE_FLOAT")])
+
+;; A floating-point mode for which moves involving FPRs may need to be split.
+(define_mode_iterator SPLITF
+  [(DF "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
+   (DI "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
+   (TF "TARGET_64BIT && TARGET_DOUBLE_FLOAT")])
+
+;; In GPR templates, a string like "mul.<d>" will expand to "mul.w" in the
+;; 32-bit version and "mul.d" in the 64-bit version.
+(define_mode_attr d [(SI "w") (DI "d")])
+
+;; This attribute gives the length suffix for a load or store instruction.
+;; The same suffixes work for zero and sign extensions.
+(define_mode_attr size [(QI "b") (HI "h") (SI "w") (DI "d")])
+(define_mode_attr SIZE [(QI "B") (HI "H") (SI "W") (DI "D")])
+
+;; This attribute gives the mode mask of a SHORT.
+(define_mode_attr mask [(QI "0x00ff") (HI "0xffff")])
+
+;; This attribute gives the size (in bits) of a SHORT, minus one.
+(define_mode_attr qi_hi [(QI "7") (HI "15")])
+
+;; Instruction names for stores.
+(define_mode_attr store [(QI "sb") (HI "sh") (SI "sw") (DI "sd")])
+
+;; Similarly for LoongArch indexed FPR loads and stores.
+(define_mode_attr floadx [(SF "fldx.s") (DF "fldx.d") (V2SF "fldx.d")])
+(define_mode_attr fstorex [(SF "fstx.s") (DF "fstx.d") (V2SF "fstx.d")])
+
+;; Similarly for LoongArch indexed GPR loads and stores.
+(define_mode_attr loadx [(QI "ldx.b")
+			 (HI "ldx.h")
+			 (SI "ldx.w")
+			 (DI "ldx.d")])
+(define_mode_attr storex [(QI "stx.b")
+			  (HI "stx.h")
+			  (SI "stx.w")
+			  (DI "stx.d")])
+
+;; This attribute gives the format suffix for floating-point operations.
+(define_mode_attr fmt [(SF "s") (DF "d")])
+
+;; This attribute gives the upper-case mode name for one unit of a
+;; floating-point mode or vector mode.
+(define_mode_attr UNITMODE [(SF "SF") (DF "DF") (V2SF "SF")])
+
+;; This attribute gives the integer mode that has half the size of
+;; the controlling mode.
+(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (V2SF "SI") (TF "DI")])
+
+;; This attribute gives the integer prefix for some instructions templates.
+(define_mode_attr p [(SI "") (DI "d")])
+
+;; This code iterator allows signed and unsigned widening multiplications
+;; to use the same template.
+(define_code_iterator any_extend [sign_extend zero_extend])
+
+;; This code iterator allows the two right shift instructions to be
+;; generated from the same template.
+(define_code_iterator any_shiftrt [ashiftrt lshiftrt])
+
+;; This code iterator allows the three shift instructions to be generated
+;; from the same template.
+(define_code_iterator any_shift [ashift ashiftrt lshiftrt])
+
+;; This code iterator allows the three bitwise instructions to be generated
+;; from the same template.
+(define_code_iterator any_bitwise [and ior xor])
+
+;; This code iterator allows unsigned and signed division to be generated
+;; from the same template.
+(define_code_iterator any_div [div udiv mod umod])
+
+;; This code iterator allows all native floating-point comparisons to be
+;; generated from the same template.
+(define_code_iterator fcond [unordered uneq unlt unle eq lt le
+			     ordered ltgt ne ge gt unge ungt])
+
+;; Equality operators.
+(define_code_iterator equality_op [eq ne])
+
+;; These code iterators allow the signed and unsigned scc operations to use
+;; the same template.
+(define_code_iterator any_gt [gt gtu])
+(define_code_iterator any_ge [ge geu])
+(define_code_iterator any_lt [lt ltu])
+(define_code_iterator any_le [le leu])
+
+(define_code_iterator any_return [return simple_return])
+
+;; <u> expands to an empty string when doing a signed operation and
+;; "u" when doing an unsigned operation.
+(define_code_attr u [(sign_extend "") (zero_extend "u")
+		     (div "") (udiv "u")
+		     (mod "") (umod "u")
+		     (gt "") (gtu "u")
+		     (ge "") (geu "u")
+		     (lt "") (ltu "u")
+		     (le "") (leu "u")])
+
+;; <U> is like <u> except uppercase.
+(define_code_attr U [(sign_extend "") (zero_extend "U")])
+
+;; <su> is like <u>, but the signed form expands to "s" rather than "".
+(define_code_attr su [(sign_extend "s") (zero_extend "u")])
+
+;; <optab> expands to the name of the optab for a particular code.
+(define_code_attr optab [(ashift "ashl")
+			 (ashiftrt "ashr")
+			 (lshiftrt "lshr")
+			 (ior "ior")
+			 (xor "xor")
+			 (and "and")
+			 (plus "add")
+			 (minus "sub")
+			 (mult "mul")
+			 (div "div")
+			 (udiv "udiv")
+			 (mod "mod")
+			 (umod "umod")
+			 (return "return")
+			 (simple_return "simple_return")])
+
+;; <insn> expands to the name of the insn that implements a particular code.
+(define_code_attr insn [(ashift "sll")
+			(ashiftrt "sra")
+			(lshiftrt "srl")
+			(ior "or")
+			(xor "xor")
+			(and "and")
+			(plus "addu")
+			(minus "subu")
+			(div "div")
+			(udiv "div")
+			(mod "mod")
+			(umod "mod")])
+
+;; <fcond> is the fcmp.cond.fmt condition associated with a particular code.
+(define_code_attr fcond [(unordered "cun")
+			 (uneq "cueq")
+			 (unlt "cult")
+			 (unle "cule")
+			 (eq "ceq")
+			 (lt "slt")
+			 (le "sle")
+			 (ordered "cor")
+			 (ltgt "sne")
+			 (ne "cune")
+			 (ge "sge")
+			 (gt "sgt")
+			 (unge "cuge")
+			 (ungt "cugt")])
+
+;; The sel mnemonic to use depending on the condition test.
+(define_code_attr sel [(eq "masknez") (ne "maskeqz")])
+(define_code_attr selinv [(eq "maskeqz") (ne "masknez")])
+
+;;
+;;  ....................
+;;
+;;	CONDITIONAL TRAPS
+;;
+;;  ....................
+;;
+
+(define_insn "trap"
+  [(trap_if (const_int 1) (const_int 0))]
+  ""
+{
+  return "break\t0";
+}
+  [(set_attr "type" "trap")])
+
+
+\f
+;;
+;;  ....................
+;;
+;;	ADDITION
+;;
+;;  ....................
+;;
+
+(define_insn "add<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(plus:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		   (match_operand:ANYF 2 "register_operand" "f")))]
+  ""
+  "fadd.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_insn "add<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(plus:GPR (match_operand:GPR 1 "register_operand" "r,r")
+		  (match_operand:GPR 2 "arith_operand" "r,I")))]
+  ""
+  "add%i2.<d>\t%0,%1,%2";
+  [(set_attr "alu_type" "add")
+   (set_attr "compression" "*,*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*addsi3_extended"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+	(sign_extend:DI
+	     (plus:SI (match_operand:SI 1 "register_operand" "r,r")
+		      (match_operand:SI 2 "arith_operand" "r,I"))))]
+  "TARGET_64BIT"
+  "add%i2.w\t%0,%1,%2"
+  [(set_attr "alu_type" "add")
+   (set_attr "mode" "SI")])
+
+(define_insn "*addsi3_extended2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+	(sign_extend:DI
+	  (subreg:SI (plus:DI (match_operand:DI 1 "register_operand" "r,r")
+			      (match_operand:DI 2 "arith_operand"    "r,I"))
+		     0)))]
+  "TARGET_64BIT"
+  "add%i2.w\t%0,%1,%2"
+  [(set_attr "alu_type" "add")
+   (set_attr "mode" "SI")])
+
+\f
+;;
+;;  ....................
+;;
+;;	SUBTRACTION
+;;
+;;  ....................
+;;
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(minus:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		    (match_operand:ANYF 2 "register_operand" "f")))]
+  ""
+  "fsub.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(minus:GPR (match_operand:GPR 1 "register_operand" "rJ")
+		   (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "sub.<d>\t%0,%z1,%2"
+  [(set_attr "alu_type" "sub")
+   (set_attr "compression" "*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*subsi3_extended"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	    (minus:SI (match_operand:SI 1 "register_operand" "rJ")
+		      (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "sub.w\t%0,%z1,%2"
+  [(set_attr "alu_type" "sub")
+   (set_attr "mode" "DI")])
+
+(define_insn "*subsi3_extended2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	  (subreg:SI (minus:DI (match_operand:DI 1 "reg_or_0_operand" "rJ")
+			       (match_operand:DI 2 "register_operand" "r"))
+		     0)))]
+  "TARGET_64BIT"
+  "sub.w\t%0,%z1,%2"
+  [(set_attr "alu_type" "sub")
+   (set_attr "mode" "SI")])
+
+\f
+;;
+;;  ....................
+;;
+;;	MULTIPLICATION
+;;
+;;  ....................
+;;
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		   (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmul.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(mult:GPR (match_operand:GPR 1 "register_operand" "r")
+		  (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "mul.<d>\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mulsidi3_64bit"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
+		 (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "mul.d\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "DI")])
+
+(define_insn "*mulsi3_extended"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	    (mult:SI (match_operand:SI 1 "register_operand" "r")
+		     (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "mul.w\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "SI")])
+
+(define_insn "*mulsi3_extended2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	  (subreg:SI (mult:DI (match_operand:DI 1 "register_operand" "r")
+			      (match_operand:DI 2 "register_operand" "r"))
+		     0)))]
+  "TARGET_64BIT"
+  "mul.w\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "SI")])
+
+
+;;
+;;  ........................
+;;
+;;	MULTIPLICATION HIGH-PART
+;;
+;;  ........................
+;;
+
+
+(define_expand "<u>mulditi3"
+  [(set (match_operand:TI 0 "register_operand")
+	(mult:TI (any_extend:TI (match_operand:DI 1 "register_operand"))
+		 (any_extend:TI (match_operand:DI 2 "register_operand"))))]
+  "TARGET_64BIT"
+{
+  rtx low = gen_reg_rtx (DImode);
+  emit_insn (gen_muldi3 (low, operands[1], operands[2]));
+
+  rtx high = gen_reg_rtx (DImode);
+  emit_insn (gen_<u>muldi3_highpart (high, operands[1], operands[2]));
+
+  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
+  emit_move_insn (gen_highpart (DImode, operands[0]), high);
+  DONE;
+})
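+
+;; In other words, the TImode product is assembled from two insns:
+;; mul.d computes the low 64 bits and mulh.d/mulh.du the high 64 bits
+;; of the same product.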
+
+(define_insn "<u>muldi3_highpart"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(truncate:DI
+	  (lshiftrt:TI
+	    (mult:TI (any_extend:TI
+		       (match_operand:DI 1 "register_operand" " r"))
+		     (any_extend:TI
+		       (match_operand:DI 2 "register_operand" " r")))
+	    (const_int 64))))]
+  "TARGET_64BIT"
+  "mulh.d<u>\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "DI")])
+
+(define_expand "<u>mulsidi3"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(mult:DI (any_extend:DI
+		   (match_operand:SI 1 "register_operand" " r"))
+		 (any_extend:DI
+		   (match_operand:SI 2 "register_operand" " r"))))]
+  "!TARGET_64BIT"
+{
+  rtx temp = gen_reg_rtx (SImode);
+  emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
+  emit_insn (gen_<u>mulsi3_highpart (loongarch_subword (operands[0], true),
+				     operands[1], operands[2]));
+  emit_insn (gen_movsi (loongarch_subword (operands[0], false), temp));
+  DONE;
+})
+
+(define_insn "<u>mulsi3_highpart"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(truncate:SI
+	  (lshiftrt:DI
+	    (mult:DI (any_extend:DI
+		       (match_operand:SI 1 "register_operand" " r"))
+		     (any_extend:DI
+		       (match_operand:SI 2 "register_operand" " r")))
+	    (const_int 32))))]
+  "!TARGET_64BIT"
+  "mulh.w<u>\t%0,%1,%2"
+  [(set_attr "type" "imul")
+   (set_attr "mode" "SI")])
+
+;;
+;;  ....................
+;;
+;;	DIVISION and REMAINDER
+;;
+;;  ....................
+;;
+
+;; Float division and modulus.
+(define_expand "div<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand")
+	(div:ANYF (match_operand:ANYF 1 "reg_or_1_operand")
+		  (match_operand:ANYF 2 "register_operand")))]
+  "TARGET_HARD_FLOAT"
+{})
+
+(define_insn "*div<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(div:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fdiv.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fdiv")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+;; On the 3A5000, the reciprocal operation is the same as the division operation.
+
+(define_insn "*recip<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "frecip.<fmt>\t%0,%2"
+  [(set_attr "type" "frdiv")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+;; Integer division and modulus.
+
+(define_insn "<optab><mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(any_div:GPR (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  {
+    return loongarch_output_division ("<insn>.<d><u>\t%0,%1,%2", operands);
+  }
+  [(set_attr "type" "idiv")
+   (set_attr "mode" "<MODE>")])
+
+
+;; Floating point multiply accumulate instructions.
+
+;; a * b + c
+(define_expand "fma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand")
+	(fma:ANYF (match_operand:ANYF 1 "register_operand")
+		  (match_operand:ANYF 2 "register_operand")
+		  (match_operand:ANYF 3 "register_operand")))]
+  "TARGET_HARD_FLOAT")
+
+(define_insn "*fma<mode>4_madd4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")
+		  (match_operand:ANYF 3 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; a * b - c
+(define_insn "fms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")
+		  (neg:ANYF (match_operand:ANYF 3 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT"
+  "fmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; fnma is defined in GCC as (fma (neg op1) op2 op3)
+;; (-op1 * op2) + op3 ==> -(op1 * op2) + op3 ==> -((op1 * op2) - op3)
+;; The loongarch nmsub instructions implement -((op1 * op2) - op3)
+;; This transformation means we may return the wrong signed zero
+;; so we check HONOR_SIGNED_ZEROS.
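+;;
+;; Concretely: with a = b = c = +0.0, fnma must give
+;; (-0.0 * 0.0) + 0.0 = +0.0, but fnmsub computes
+;; -((0.0 * 0.0) - 0.0) = -0.0.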
+
+;; -a * b + c
+(define_insn "fnma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+		  (match_operand:ANYF 2 "register_operand" "f")
+		  (match_operand:ANYF 3 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fnmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; fnms is defined as: (fma (neg op1) op2 (neg op3))
+;; ((-op1) * op2) - op3 ==> -(op1 * op2) - op3 ==> -((op1 * op2) + op3)
+;; The loongarch nmadd instructions implement -((op1 * op2) + op3)
+;; This transformation means we may return the wrong signed zero
+;; so we check HONOR_SIGNED_ZEROS.
+
+;; -a * b - c
+(define_insn "fnms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(fma:ANYF
+	  (neg:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+	  (match_operand:ANYF 2 "register_operand" "f")
+	  (neg:ANYF (match_operand:ANYF 3 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fnmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(-a * b - c), modulo signed zeros
+(define_insn "*fma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(neg:ANYF (match_operand:ANYF 1 "register_operand" " f"))
+		(match_operand:ANYF 2 "register_operand" " f")
+		(neg:ANYF (match_operand:ANYF 3 "register_operand" " f")))))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(-a * b + c), modulo signed zeros
+(define_insn "*fms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(neg:ANYF (match_operand:ANYF 1 "register_operand" " f"))
+		(match_operand:ANYF 2 "register_operand" " f")
+		(match_operand:ANYF 3 "register_operand" " f"))))]
+  "TARGET_HARD_FLOAT && !HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "fmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(a * b + c)
+(define_insn "*fnms<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(match_operand:ANYF 1 "register_operand" " f")
+		(match_operand:ANYF 2 "register_operand" " f")
+		(match_operand:ANYF 3 "register_operand" " f"))))]
+  "TARGET_HARD_FLOAT"
+  "fnmadd.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+
+;; -(a * b - c)
+(define_insn "*fnma<mode>4"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF
+	    (fma:ANYF
+		(match_operand:ANYF 1 "register_operand" " f")
+		(match_operand:ANYF 2 "register_operand" " f")
+		(neg:ANYF (match_operand:ANYF 3 "register_operand" " f")))))]
+  "TARGET_HARD_FLOAT"
+  "fnmsub.<fmt>\t%0,%1,%2,%3"
+  [(set_attr "type" "fmadd")
+   (set_attr "mode" "<UNITMODE>")])
+\f
+;;
+;;  ....................
+;;
+;;	SQUARE ROOT
+;;
+;;  ....................
+
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(sqrt:ANYF (match_operand:ANYF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fsqrt.<fmt>\t%0,%1"
+  [(set_attr "type" "fsqrt")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+(define_insn "*rsqrt<mode>a"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
+		  (sqrt:ANYF (match_operand:ANYF 2 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT && flag_unsafe_math_optimizations"
+  "frsqrt.<fmt>\t%0,%2"
+  [(set_attr "type" "frsqrt")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
+(define_insn "*rsqrt<mode>b"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(sqrt:ANYF (div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
+			     (match_operand:ANYF 2 "register_operand" "f"))))]
+  "TARGET_HARD_FLOAT && flag_unsafe_math_optimizations"
+  "frsqrt.<fmt>\t%0,%2"
+  [(set_attr "type" "frsqrt")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+\f
+;;
+;;  ....................
+;;
+;;	ABSOLUTE VALUE
+;;
+;;  ....................
+
+(define_insn "abs<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(abs:ANYF (match_operand:ANYF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fabs.<fmt>\t%0,%1"
+  [(set_attr "type" "fabs")
+   (set_attr "mode" "<UNITMODE>")])
+\f
+;;
+;;  ...................
+;;
+;;  Count leading zeroes.
+;;
+;;  ...................
+;;
+
+(define_insn "clz<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(clz:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "clz.<d>\t%0,%1"
+  [(set_attr "type" "clz")
+   (set_attr "mode" "<MODE>")])
+
+;;
+;;  ...................
+;;
+;;  Count trailing zeroes.
+;;
+;;  ...................
+;;
+
+(define_insn "ctz<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(ctz:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "ctz.<d>\t%0,%1"
+  [(set_attr "type" "clz")
+   (set_attr "mode" "<MODE>")])
+
+;;
+;;  ....................
+;;
+;;	MIN/MAX
+;;
+;;  ....................
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (smax:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmax.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (smin:ANYF (match_operand:ANYF 1 "register_operand" "f")
+		  (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fmin.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smaxa<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (if_then_else:ANYF
+	      (gt (abs:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+		  (abs:ANYF (match_operand:ANYF 2 "register_operand" "f")))
+	      (match_dup 1)
+	      (match_dup 2)))]
+  "TARGET_HARD_FLOAT"
+  "fmaxa.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smina<mode>3"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (if_then_else:ANYF
+		(lt (abs:ANYF (match_operand:ANYF 1 "register_operand" "f"))
+		    (abs:ANYF (match_operand:ANYF 2 "register_operand" "f")))
+		(match_dup 1)
+		(match_dup 2)))]
+  "TARGET_HARD_FLOAT"
+  "fmina.<fmt>\t%0,%1,%2"
+  [(set_attr "type" "fmove")
+   (set_attr "mode" "<MODE>")])
+\f
+;;
+;;  ....................
+;;
+;;	NEGATION and ONE'S COMPLEMENT
+;;
+;;  ....................
+
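+;; In the output templates below, "%." prints the hard-wired zero register,
+;; so negation is a subtraction from zero and one's complement is a NOR
+;; with zero.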
+(define_insn "neg<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(neg:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "sub.<d>\t%0,%.,%1"
+  [(set_attr "alu_type"	"sub")
+   (set_attr "mode"	"<MODE>")])
+
+(define_insn "one_cmpl<mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(not:GPR (match_operand:GPR 1 "register_operand" "r")))]
+  ""
+  "nor\t%0,%.,%1"
+  [(set_attr "alu_type" "not")
+   (set_attr "compression" "*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "neg<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(neg:ANYF (match_operand:ANYF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fneg.<fmt>\t%0,%1"
+  [(set_attr "type" "fneg")
+   (set_attr "mode" "<UNITMODE>")])
+\f
+
+;;
+;;  ....................
+;;
+;;	LOGICAL
+;;
+;;  ....................
+;;
+
+(define_insn "<optab><mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(any_bitwise:GPR (match_operand:GPR 1 "register_operand" "r,r")
+		 (match_operand:GPR 2 "uns_arith_operand" "r,K")))]
+  ""
+  "<insn>%i2\t%0,%1,%2"
+  [(set_attr "type" "logical")
+   (set_attr "compression" "*,*")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "and<mode>3_extended"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(and:GPR (match_operand:GPR 1 "nonimmediate_operand" "r")
+		 (match_operand:GPR 2 "low_bitmask_operand" "Yx")))]
+  ""
+{
+  int len;
+
+  len = low_bitmask_len (<MODE>mode, INTVAL (operands[2]));
+  operands[2] = GEN_INT (len - 1);
+  return "bstrpick.<d>\t%0,%1,%2,0";
+}
+  [(set_attr "move_type" "pick_ins")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*iorhi3"
+  [(set (match_operand:HI 0 "register_operand" "=r,r")
+	(ior:HI (match_operand:HI 1 "register_operand" "r,r")
+		(match_operand:HI 2 "uns_arith_operand" "r,K")))]
+  ""
+  "or%i2\t%0,%1,%2"
+  [(set_attr "type" "logical")
+   (set_attr "mode" "HI")])
+
+(define_insn "*nor<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(and:GPR (not:GPR (match_operand:GPR 1 "register_operand" "r"))
+		 (not:GPR (match_operand:GPR 2 "register_operand" "r"))))]
+  ""
+  "nor\t%0,%1,%2"
+  [(set_attr "type" "logical")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "andn<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(and:GPR
+	  (not:GPR (match_operand:GPR 1 "register_operand" "r"))
+	  (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "andn\t%0,%2,%1"
+  [(set_attr "type" "logical")])
+
+(define_insn "orn<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(ior:GPR
+	  (not:GPR (match_operand:GPR 1 "register_operand" "r"))
+	  (match_operand:GPR 2 "register_operand" "r")))]
+  ""
+  "orn\t%0,%2,%1"
+  [(set_attr "type" "logical")])
+
+\f
+;;
+;;  ....................
+;;
+;;	TRUNCATION
+;;
+;;  ....................
+
+(define_insn "truncdfsf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+	(float_truncate:SF (match_operand:DF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "fcvt.s.d\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "cnv_mode"	"D2S")
+   (set_attr "mode" "SF")])
+
+;; Integer truncation patterns.  Truncating SImode values to smaller
+;; modes is a no-op, as it is for most other GCC ports.  Truncating
+;; DImode values to SImode is not a no-op for TARGET_64BIT since we
+;; need to make sure that the lower 32 bits are properly sign-extended
+;; (see TARGET_TRULY_NOOP_TRUNCATION).  Truncating DImode values into modes
+;; smaller than SImode is equivalent to two separate truncations:
+;;
+;;			  A       B
+;;    DI ---> HI  ==  DI ---> SI ---> HI
+;;    DI ---> QI  ==  DI ---> SI ---> QI
+;;
+;; Step A needs a real instruction but step B does not.
+
+(define_insn "truncdi<mode>2"
+  [(set (match_operand:SUBDI 0 "nonimmediate_operand" "=r,m,k")
+	(truncate:SUBDI (match_operand:DI 1 "register_operand" "r,r,r")))]
+  "TARGET_64BIT"
+  "@
+    slli.w\t%0,%1,0
+    st.<size>\t%1,%0
+    stx.<size>\t%1,%0"
+  [(set_attr "move_type" "sll0,store,store")
+   (set_attr "mode" "SI")])
+
+(define_insn "truncdisi2_extended"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=ZC")
+	(truncate:SI (match_operand:DI 1 "register_operand" "r")))]
+  "TARGET_64BIT"
+  "stptr.w\t%1,%0"
+  [(set_attr "move_type" "store")
+   (set_attr "mode" "SI")])
+
+;; Combiner patterns to optimize shift/truncate combinations.
+
+(define_insn "*ashr_trunc<mode>"
+  [(set (match_operand:SUBDI 0 "register_operand" "=r")
+	(truncate:SUBDI
+	  (ashiftrt:DI (match_operand:DI 1 "register_operand" "r")
+		       (match_operand:DI 2 "const_arith_operand" ""))))]
+  "TARGET_64BIT && IN_RANGE (INTVAL (operands[2]), 32, 63)"
+  "srai.d\t%0,%1,%2"
+  [(set_attr "type" "shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*lshr32_trunc<mode>"
+  [(set (match_operand:SUBDI 0 "register_operand" "=r")
+	(truncate:SUBDI
+	  (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
+		       (const_int 32))))]
+  "TARGET_64BIT"
+  "srai.d\t%0,%1,32"
+  [(set_attr "type" "shift")
+   (set_attr "mode" "<MODE>")])
+\f
+;;
+;;  ....................
+;;
+;;	ZERO EXTENSION
+;;
+;;  ....................
+
+(define_insn "zero_extendsidi2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
+	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,ZC,m,k")))]
+  "TARGET_64BIT"
+  "@
+   bstrpick.d\t%0,%1,31,0
+   ldptr.w\t%0,%1\n\tlu32i.d\t%0,0
+   ld.wu\t%0,%1
+   ldx.wu\t%0,%1"
+  [(set_attr "move_type" "arith,load,load,load")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "1,2,1,1")])
+
+(define_insn "zero_extend<SHORT:mode><GPR:mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,r")
+	(zero_extend:GPR
+	     (match_operand:SHORT 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   bstrpick.w\t%0,%1,<SHORT:qi_hi>,0
+   ld.<SHORT:size>u\t%0,%1
+   ldx.<SHORT:size>u\t%0,%1"
+  [(set_attr "move_type" "pick_ins,load,load")
+   (set_attr "compression" "*,*,*")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "zero_extendqihi2"
+  [(set (match_operand:HI 0 "register_operand" "=r,r,r")
+	(zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   andi\t%0,%1,0xff
+   ld.bu\t%0,%1
+   ldx.bu\t%0,%1"
+  [(set_attr "move_type" "andi,load,load")
+   (set_attr "mode" "HI")])
+
+;; Combiner patterns to optimize truncate/zero_extend combinations.
+
+(define_insn "*zero_extend<GPR:mode>_trunc<SHORT:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(zero_extend:GPR
+	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "bstrpick.w\t%0,%1,<SHORT:qi_hi>,0"
+  [(set_attr "move_type" "pick_ins")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*zero_extendhi_truncqi"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+	(zero_extend:HI
+	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "andi\t%0,%1,0xff"
+  [(set_attr "alu_type" "and")
+   (set_attr "mode" "HI")])
+\f
+;;
+;;  ....................
+;;
+;;	SIGN EXTENSION
+;;
+;;  ....................
+
+;; Extension insns.
+;; Those with an integer source operand are ordered widest source type first.
+
+;; When TARGET_64BIT, all SImode integers should already be in sign-extended
+;; form (see TARGET_TRULY_NOOP_TRUNCATION and truncdisi2).  We can therefore
+;; get rid of register->register instructions if we constrain the source to
+;; be in the same register as the destination.
+;;
+;; Only the pre-reload scheduler sees the type of the register alternatives;
+;; we split them into nothing before the post-reload scheduler runs.
+;; These alternatives therefore have type "move" in order to reflect
+;; what happens if the two pre-reload operands cannot be tied, and are
+;; instead allocated two separate GPRs.
+(define_insn_and_split "extendsidi2"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
+	(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,ZC,m,k")))]
+  "TARGET_64BIT"
+  "@
+   #
+   ldptr.w\t%0,%1
+   ld.w\t%0,%1
+   ldx.w\t%0,%1"
+  "&& reload_completed && register_operand (operands[1], VOIDmode)"
+  [(const_int 0)]
+{
+  emit_note (NOTE_INSN_DELETED);
+  DONE;
+}
+  [(set_attr "move_type" "move,load,load,load")
+   (set_attr "mode" "DI")])
+
+(define_insn "extend<SHORT:mode><GPR:mode>2"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r,r")
+	(sign_extend:GPR
+	     (match_operand:SHORT 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   ext.w.<SHORT:size>\t%0,%1
+   ld.<SHORT:size>\t%0,%1
+   ldx.<SHORT:size>\t%0,%1"
+  [(set_attr "move_type" "signext,load,load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "extendqihi2"
+  [(set (match_operand:HI 0 "register_operand" "=r,r,r")
+	(sign_extend:HI
+	     (match_operand:QI 1 "nonimmediate_operand" "r,m,k")))]
+  ""
+  "@
+   ext.w.b\t%0,%1
+   ld.b\t%0,%1
+   ldx.b\t%0,%1"
+  [(set_attr "move_type" "signext,load,load")
+   (set_attr "mode" "SI")])
+
+(define_insn "*extenddi_truncate<mode>"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "ext.w.<size>\t%0,%1"
+  [(set_attr "move_type" "signext")
+   (set_attr "mode" "DI")])
+
+(define_insn "*extendsi_truncate<mode>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(sign_extend:SI
+	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "ext.w.<size>\t%0,%1"
+  [(set_attr "move_type" "signext")
+   (set_attr "mode" "SI")])
+
+(define_insn "*extendhi_truncateqi"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+	(sign_extend:HI
+	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
+  "TARGET_64BIT"
+  "ext.w.b\t%0,%1"
+  [(set_attr "move_type" "signext")
+   (set_attr "mode" "SI")])
+
+(define_insn "extendsfdf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+	(float_extend:DF (match_operand:SF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "fcvt.d.s\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "cnv_mode"	"S2D")
+   (set_attr "mode" "DF")])
+\f
+;;
+;;  ....................
+;;
+;;	CONVERSIONS
+;;
+;;  ....................
+
+;; Conversion of a floating-point value to an integer.
+
+(define_insn "fix_truncdfsi2"
+  [(set (match_operand:SI 0 "register_operand" "=f")
+	(fix:SI (match_operand:DF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ftintrz.w.d %0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode"	"D2I")])
+
+(define_insn "fix_truncsfsi2"
+  [(set (match_operand:SI 0 "register_operand" "=f")
+	(fix:SI (match_operand:SF 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "ftintrz.w.s %0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"S2I")])
+
+
+(define_insn "fix_truncdfdi2"
+  [(set (match_operand:DI 0 "register_operand" "=f")
+	(fix:DI (match_operand:DF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ftintrz.l.d %0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode"	"D2I")])
+
+
+(define_insn "fix_truncsfdi2"
+  [(set (match_operand:DI 0 "register_operand" "=f")
+	(fix:DI (match_operand:SF 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ftintrz.l.s %0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"S2I")])
+
+;; Conversion of an integral (or boolean) value to a floating-point value.
+
+(define_insn "floatsidf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+	(float:DF (match_operand:SI 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ffint.d.w\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode"	"I2D")])
+
+(define_insn "floatdidf2"
+  [(set (match_operand:DF 0 "register_operand" "=f")
+	(float:DF (match_operand:DI 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ffint.d.l\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "DF")
+   (set_attr "cnv_mode" "I2D")])
+
+(define_insn "floatsisf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+	(float:SF (match_operand:SI 1 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "ffint.s.w\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"I2S")])
+
+(define_insn "floatdisf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+	(float:SF (match_operand:DI 1 "register_operand" "f")))]
+  "TARGET_DOUBLE_FLOAT"
+  "ffint.s.l\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")
+   (set_attr "cnv_mode"	"I2S")])
+
+;; Conversion of a floating-point value to an unsigned integer.
+
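+;; The expanders below compare x against 2^31 (or 2^63 for DImode results):
+;; smaller values use the plain signed conversion, while larger values
+;; convert x - 2^31 (resp. 2^63) and then OR the sign bit (BITMASK_HIGH,
+;; shifted up for DImode) back into the result.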
+(define_expand "fixuns_truncdfsi2"
+  [(set (match_operand:SI 0 "register_operand")
+	(unsigned_fix:SI (match_operand:DF 1 "register_operand")))]
+  "TARGET_DOUBLE_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (DFmode);
+  rtx reg2 = gen_reg_rtx (DFmode);
+  rtx reg3 = gen_reg_rtx (SImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 31, DFmode);
+
+  if (reg1)		      /* Turn off complaints about unreached code.  */
+    {
+      loongarch_emit_move (reg1,
+			   const_double_from_real_value (offset, DFmode));
+      do_pending_stack_adjust ();
+
+      test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+      emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
+
+      emit_insn (gen_fix_truncdfsi2 (operands[0], operands[1]));
+      emit_jump_insn (gen_rtx_SET (pc_rtx,
+				   gen_rtx_LABEL_REF (VOIDmode, label2)));
+      emit_barrier ();
+
+      emit_label (label1);
+      loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
+      loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
+				     (BITMASK_HIGH, SImode)));
+
+      emit_insn (gen_fix_truncdfsi2 (operands[0], reg2));
+      emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
+
+      emit_label (label2);
+
+      /* Allow REG_NOTES to be set on last insn (labels don't have enough
+	 fields, and can't be used for REG_NOTES anyway).  */
+      emit_use (stack_pointer_rtx);
+      DONE;
+    }
+})
+
+(define_expand "fixuns_truncdfdi2"
+  [(set (match_operand:DI 0 "register_operand")
+	(unsigned_fix:DI (match_operand:DF 1 "register_operand")))]
+  "TARGET_DOUBLE_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (DFmode);
+  rtx reg2 = gen_reg_rtx (DFmode);
+  rtx reg3 = gen_reg_rtx (DImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 63, DFmode);
+
+  loongarch_emit_move (reg1, const_double_from_real_value (offset, DFmode));
+  do_pending_stack_adjust ();
+
+  test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+  emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
+
+  emit_insn (gen_fix_truncdfdi2 (operands[0], operands[1]));
+  emit_jump_insn (gen_rtx_SET (pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label2)));
+  emit_barrier ();
+
+  emit_label (label1);
+  loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
+  loongarch_emit_move (reg3, GEN_INT (BITMASK_HIGH));
+  emit_insn (gen_ashldi3 (reg3, reg3, GEN_INT (32)));
+
+  emit_insn (gen_fix_truncdfdi2 (operands[0], reg2));
+  emit_insn (gen_iordi3 (operands[0], operands[0], reg3));
+
+  emit_label (label2);
+
+  /* Allow REG_NOTES to be set on last insn (labels don't have enough
+     fields, and can't be used for REG_NOTES anyway).  */
+  emit_use (stack_pointer_rtx);
+  DONE;
+})
+
+(define_expand "fixuns_truncsfsi2"
+  [(set (match_operand:SI 0 "register_operand")
+	(unsigned_fix:SI (match_operand:SF 1 "register_operand")))]
+  "TARGET_HARD_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (SFmode);
+  rtx reg2 = gen_reg_rtx (SFmode);
+  rtx reg3 = gen_reg_rtx (SImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 31, SFmode);
+
+  loongarch_emit_move (reg1, const_double_from_real_value (offset, SFmode));
+  do_pending_stack_adjust ();
+
+  test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+  emit_jump_insn (gen_cbranchsf4 (test, operands[1], reg1, label1));
+
+  emit_insn (gen_fix_truncsfsi2 (operands[0], operands[1]));
+  emit_jump_insn (gen_rtx_SET (pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label2)));
+  emit_barrier ();
+
+  emit_label (label1);
+  loongarch_emit_move (reg2, gen_rtx_MINUS (SFmode, operands[1], reg1));
+  loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
+				 (BITMASK_HIGH, SImode)));
+
+  emit_insn (gen_fix_truncsfsi2 (operands[0], reg2));
+  emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
+
+  emit_label (label2);
+
+  /* Allow REG_NOTES to be set on last insn (labels don't have enough
+     fields, and can't be used for REG_NOTES anyway).  */
+  emit_use (stack_pointer_rtx);
+  DONE;
+})
+
+(define_expand "fixuns_truncsfdi2"
+  [(set (match_operand:DI 0 "register_operand")
+	(unsigned_fix:DI (match_operand:SF 1 "register_operand")))]
+  "TARGET_DOUBLE_FLOAT"
+{
+  rtx reg1 = gen_reg_rtx (SFmode);
+  rtx reg2 = gen_reg_rtx (SFmode);
+  rtx reg3 = gen_reg_rtx (DImode);
+  rtx_code_label *label1 = gen_label_rtx ();
+  rtx_code_label *label2 = gen_label_rtx ();
+  rtx test;
+  REAL_VALUE_TYPE offset;
+
+  real_2expN (&offset, 63, SFmode);
+
+  loongarch_emit_move (reg1, const_double_from_real_value (offset, SFmode));
+  do_pending_stack_adjust ();
+
+  test = gen_rtx_GE (VOIDmode, operands[1], reg1);
+  emit_jump_insn (gen_cbranchsf4 (test, operands[1], reg1, label1));
+
+  emit_insn (gen_fix_truncsfdi2 (operands[0], operands[1]));
+  emit_jump_insn (gen_rtx_SET (pc_rtx, gen_rtx_LABEL_REF (VOIDmode, label2)));
+  emit_barrier ();
+
+  emit_label (label1);
+  loongarch_emit_move (reg2, gen_rtx_MINUS (SFmode, operands[1], reg1));
+  loongarch_emit_move (reg3, GEN_INT (BITMASK_HIGH));
+  emit_insn (gen_ashldi3 (reg3, reg3, GEN_INT (32)));
+
+  emit_insn (gen_fix_truncsfdi2 (operands[0], reg2));
+  emit_insn (gen_iordi3 (operands[0], operands[0], reg3));
+
+  emit_label (label2);
+
+  /* Allow REG_NOTES to be set on last insn (labels don't have enough
+     fields, and can't be used for REG_NOTES anyway).  */
+  emit_use (stack_pointer_rtx);
+  DONE;
+})
+\f
+;;
+;;  ....................
+;;
+;;	EXTRACT AND INSERT
+;;
+;;  ....................
+
+(define_expand "extzv<mode>"
+  [(set (match_operand:GPR 0 "register_operand")
+	(zero_extract:GPR (match_operand:GPR 1 "register_operand")
+			  (match_operand 2 "const_int_operand")
+			  (match_operand 3 "const_int_operand")))]
+  ""
+{
+  if (!loongarch_use_ins_ext_p (operands[1], INTVAL (operands[2]),
+			   INTVAL (operands[3])))
+    FAIL;
+})
+
+(define_insn "*extzv<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(zero_extract:GPR (match_operand:GPR 1 "register_operand" "r")
+			  (match_operand 2 "const_int_operand" "")
+			  (match_operand 3 "const_int_operand" "")))]
+  "loongarch_use_ins_ext_p (operands[1], INTVAL (operands[2]),
+		       INTVAL (operands[3]))"
+{
+  operands[2] = GEN_INT (INTVAL (operands[2]) + INTVAL (operands[3]) - 1);
+  return "bstrpick.<d>\t%0,%1,%2,%3";
+}
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "insv<mode>"
+  [(set (zero_extract:GPR (match_operand:GPR 0 "register_operand")
+			  (match_operand 1 "const_int_operand")
+			  (match_operand 2 "const_int_operand"))
+	(match_operand:GPR 3 "reg_or_0_operand"))]
+  ""
+{
+  if (!loongarch_use_ins_ext_p (operands[0], INTVAL (operands[1]),
+			   INTVAL (operands[2])))
+    FAIL;
+})
+
+(define_insn "*insv<mode>"
+  [(set (zero_extract:GPR (match_operand:GPR 0 "register_operand" "+r")
+			  (match_operand:SI 1 "const_int_operand" "")
+			  (match_operand:SI 2 "const_int_operand" ""))
+	(match_operand:GPR 3 "reg_or_0_operand" "rJ"))]
+  "loongarch_use_ins_ext_p (operands[0], INTVAL (operands[1]),
+		       INTVAL (operands[2]))"
+{
+  operands[1] = GEN_INT (INTVAL (operands[1]) + INTVAL (operands[2]) - 1);
+  return "bstrins.<d>\t%0,%z3,%1,%2";
+}
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
+\f
+;;
+;;  ....................
+;;
+;;	DATA MOVEMENT
+;;
+;;  ....................
+
+;; Allow combine to split complex const_int load sequences, using operand 2
+;; to store the intermediate results.  See move_operand for details.
+(define_split
+  [(set (match_operand:GPR 0 "register_operand")
+	(match_operand:GPR 1 "splittable_const_int_operand"))
+   (clobber (match_operand:GPR 2 "register_operand"))]
+  ""
+  [(const_int 0)]
+{
+  loongarch_move_integer (operands[2], operands[0], INTVAL (operands[1]));
+  DONE;
+})
+
+;; 64-bit integer moves
+
+;; Unlike most other insns, the move insns can't be split with
+;; different predicates, because register spilling and other parts of
+;; the compiler have memoized the insn number already.
+
+(define_expand "movdi"
+  [(set (match_operand:DI 0 "")
+	(match_operand:DI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (DImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movdi_32bit"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m")
+       (match_operand:DI 1 "move_operand" "r,i,w,r,*J*r,*m,*f,*f"))]
+  "!TARGET_64BIT
+   && (register_operand (operands[0], DImode)
+       || reg_or_0_operand (operands[1], DImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,mgtf,fpload,mftg,fpstore")
+   (set_attr "mode" "DI")])
+
+(define_insn "*movdi_64bit"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m")
+	(match_operand:DI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f"))]
+  "TARGET_64BIT
+   && (register_operand (operands[0], DImode)
+       || reg_or_0_operand (operands[1], DImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,mgtf,fpload,mftg,fpstore")
+   (set_attr "mode" "DI")])
+
+;; 32-bit integer moves
+
+(define_expand "movsi"
+  [(set (match_operand:SI 0 "")
+	(match_operand:SI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (SImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movsi_internal"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m,*r,*z")
+	(match_operand:SI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f,*z,*r"))]
+  "(register_operand (operands[0], SImode)
+       || reg_or_0_operand (operands[1], SImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,mgtf,fpload,mftg,fpstore,mftg,mgtf")
+   (set_attr "compression" "all,*,*,*,*,*,*,*,*,*")
+   (set_attr "mode" "SI")])
+
+;; 16-bit integer moves
+
+;; Unlike most other insns, the move insns can't be split with
+;; different predicates, because register spilling and other parts of
+;; the compiler have memoized the insn number already.
+;; Unsigned loads are used because LOAD_EXTEND_OP returns ZERO_EXTEND.
+
+(define_expand "movhi"
+  [(set (match_operand:HI 0 "")
+	(match_operand:HI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (HImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movhi_internal"
+  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,r,m,r,k")
+	(match_operand:HI 1 "move_operand" "r,Yd,I,m,rJ,k,rJ"))]
+  "(register_operand (operands[0], HImode)
+       || reg_or_0_operand (operands[1], HImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,const,load,store,load,store")
+   (set_attr "compression" "all,all,*,*,*,*,*")
+   (set_attr "mode" "HI")])
+
+;; 8-bit integer moves
+
+;; Unlike most other insns, the move insns can't be split with
+;; different predicates, because register spilling and other parts of
+;; the compiler have memoized the insn number already.
+;; Unsigned loads are used because LOAD_EXTEND_OP returns ZERO_EXTEND.
+
+(define_expand "movqi"
+  [(set (match_operand:QI 0 "")
+	(match_operand:QI 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (QImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movqi_internal"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,m,r,k")
+	(match_operand:QI 1 "move_operand" "r,I,m,rJ,k,rJ"))]
+  "(register_operand (operands[0], QImode)
+       || reg_or_0_operand (operands[1], QImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store,load,store")
+   (set_attr "compression" "all,*,*,*,*,*")
+   (set_attr "mode" "QI")])
+
+;; 32-bit floating point moves
+
+(define_expand "movsf"
+  [(set (match_operand:SF 0 "")
+	(match_operand:SF 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (SFmode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movsf_hardfloat"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
+	(match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*G*r,*m,*r"))]
+  "TARGET_HARD_FLOAT
+   && (register_operand (operands[0], SFmode)
+       || reg_or_0_operand (operands[1], SFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+   (set_attr "mode" "SF")])
+
+(define_insn "*movsf_softfloat"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,m")
+	(match_operand:SF 1 "move_operand" "Gr,m,r"))]
+  "TARGET_SOFT_FLOAT
+   && (register_operand (operands[0], SFmode)
+       || reg_or_0_operand (operands[1], SFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,load,store")
+   (set_attr "mode" "SF")])
+
+;; 64-bit floating point moves
+
+(define_expand "movdf"
+  [(set (match_operand:DF 0 "")
+	(match_operand:DF 1 ""))]
+  ""
+{
+  if (loongarch_legitimize_move (DFmode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movdf_hardfloat"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
+	(match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*r*G,*m,*r"))]
+  "TARGET_DOUBLE_FLOAT
+   && (register_operand (operands[0], DFmode)
+       || reg_or_0_operand (operands[1], DFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+   (set_attr "mode" "DF")])
+
+(define_insn "*movdf_softfloat"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=r,r,m")
+	(match_operand:DF 1 "move_operand" "rG,m,rG"))]
+  "(TARGET_SOFT_FLOAT || TARGET_SINGLE_FLOAT)
+   && (register_operand (operands[0], DFmode)
+       || reg_or_0_operand (operands[1], DFmode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,load,store")
+   (set_attr "mode" "DF")])
+
+
+;; 128-bit integer moves
+
+(define_expand "movti"
+  [(set (match_operand:TI 0)
+	(match_operand:TI 1))]
+  "TARGET_64BIT"
+{
+  if (loongarch_legitimize_move (TImode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "*movti"
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=r,r,r,m")
+	(match_operand:TI 1 "move_operand" "r,i,m,rJ"))]
+  "TARGET_64BIT
+   && (register_operand (operands[0], TImode)
+       || reg_or_0_operand (operands[1], TImode))"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "move,const,load,store")
+   (set (attr "mode")
+    (if_then_else (eq_attr "move_type" "imul")
+		      (const_string "SI")
+		      (const_string "TI")))])
+
+;; 128-bit floating point moves
+
+(define_expand "movtf"
+  [(set (match_operand:TF 0)
+	(match_operand:TF 1))]
+  "TARGET_64BIT"
+{
+  if (loongarch_legitimize_move (TFmode, operands[0], operands[1]))
+    DONE;
+})
+
+;; This pattern handles both hard- and soft-float cases.
+(define_insn "*movtf"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=r,r,m,f,r,f,m")
+	(match_operand:TF 1 "move_operand" "rG,m,rG,rG,f,m,f"))]
+  "TARGET_64BIT
+   && (register_operand (operands[0], TFmode)
+       || reg_or_0_operand (operands[1], TFmode))"
+  "#"
+  [(set_attr "move_type" "move,load,store,mgtf,mftg,fpload,fpstore")
+   (set_attr "mode" "TF")])
+
+(define_split
+  [(set (match_operand:MOVE64 0 "nonimmediate_operand")
+	(match_operand:MOVE64 1 "move_operand"))]
+  "reload_completed && loongarch_split_move_insn_p (operands[0], operands[1], insn)"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+(define_split
+  [(set (match_operand:MOVE128 0 "nonimmediate_operand")
+	(match_operand:MOVE128 1 "move_operand"))]
+  "reload_completed && loongarch_split_move_insn_p (operands[0], operands[1], insn)"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+;; Emit a doubleword move in which exactly one of the operands is
+;; a floating-point register.  We can't just emit two normal moves
+;; because of the constraints imposed by the FPU register model;
+;; see loongarch_can_change_mode_class for details.  Instead, we keep
+;; the FPR whole and use special patterns to refer to each word of
+;; the other operand.
+
+(define_expand "move_doubleword_fpr<mode>"
+  [(set (match_operand:SPLITF 0)
+	(match_operand:SPLITF 1))]
+  ""
+{
+  if (FP_REG_RTX_P (operands[0]))
+    {
+      rtx low = loongarch_subword (operands[1], 0);
+      rtx high = loongarch_subword (operands[1], 1);
+      emit_insn (gen_load_low<mode> (operands[0], low));
+      if (!TARGET_64BIT)
+       emit_insn (gen_movgr2frh<mode> (operands[0], high, operands[0]));
+      else
+       emit_insn (gen_load_high<mode> (operands[0], high, operands[0]));
+    }
+  else
+    {
+      rtx low = loongarch_subword (operands[0], 0);
+      rtx high = loongarch_subword (operands[0], 1);
+      emit_insn (gen_store_word<mode> (low, operands[1], const0_rtx));
+      if (!TARGET_64BIT)
+       emit_insn (gen_movfrh2gr<mode> (high, operands[1]));
+      else
+       emit_insn (gen_store_word<mode> (high, operands[1], const1_rtx));
+    }
+  DONE;
+})
+
+;; Conditional move instructions.
+
+(define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(if_then_else:GPR
+	 (equality_op:GPR2 (match_operand:GPR2 1 "register_operand" "r,r")
+			   (const_int 0))
+	 (match_operand:GPR 2 "reg_or_0_operand" "r,J")
+	 (match_operand:GPR 3 "reg_or_0_operand" "J,r")))]
+  "register_operand (operands[2], <GPR:MODE>mode)
+       != register_operand (operands[3], <GPR:MODE>mode)"
+  "@
+   <sel>\t%0,%2,%1
+   <selinv>\t%0,%3,%1"
+  [(set_attr "type" "condmove")
+   (set_attr "mode" "<GPR:MODE>")])
+
+;; fsel copies the 3rd argument when the 1st is non-zero and the 2nd
+;; argument if the 1st is zero.  This means operands 2 and 3 are
+;; inverted in the instruction.
+
+(define_insn "*sel<mode>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(if_then_else:ANYF
+	 (ne:FCC (match_operand:FCC 1 "register_operand" "z")
+		 (const_int 0))
+	 (match_operand:ANYF 2 "reg_or_0_operand" "f")
+	 (match_operand:ANYF 3 "reg_or_0_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fsel\t%0,%3,%2,%1"
+  [(set_attr "type" "condmove")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+;; These are the main define_expand's used to make conditional moves.
+
+(define_expand "mov<mode>cc"
+  [(set (match_operand:GPR 0 "register_operand")
+	(if_then_else:GPR (match_operator 1 "comparison_operator"
+			 [(match_operand:GPR 2 "reg_or_0_operand")
+			  (match_operand:GPR 3 "reg_or_0_operand")])))]
+  "TARGET_COND_MOVE_INT"
+{
+  if (!INTEGRAL_MODE_P (GET_MODE (XEXP (operands[1], 0))))
+    FAIL;
+
+  loongarch_expand_conditional_move (operands);
+  DONE;
+})
+
+(define_expand "mov<mode>cc"
+  [(set (match_operand:ANYF 0 "register_operand")
+	(if_then_else:ANYF (match_operator 1 "comparison_operator"
+			  [(match_operand:ANYF 2 "reg_or_0_operand")
+			   (match_operand:ANYF 3 "reg_or_0_operand")])))]
+  "TARGET_COND_MOVE_FLOAT"
+{
+  if (!FLOAT_MODE_P (GET_MODE (XEXP (operands[1], 0))))
+    FAIL;
+
+  loongarch_expand_conditional_move (operands);
+  DONE;
+})
+
+(define_insn "lu32i_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI
+	  (zero_extend:DI
+	    (subreg:SI (match_operand:DI 1 "register_operand" "0") 0))
+	  (match_operand:DI 2 "const_lu32i_operand" "u")))]
+  "TARGET_64BIT"
+  "lu32i.d\t%0,%X2>>32"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")])
+
+(define_insn "lu52i_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI
+	  (and:DI (match_operand:DI 1 "register_operand" "r")
+		  (match_operand 2 "lu52i_mask_operand"))
+	  (match_operand 3 "const_lu52i_operand" "v")))]
+    "TARGET_64BIT"
+    "lu52i.d\t%0,%1,%X3>>52"
+    [(set_attr "type" "arith")
+     (set_attr "mode" "DI")])
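+
+;; lu32i.d replaces bits 32..51 of its destination (sign-extending into
+;; bits 52..63), and lu52i.d replaces bits 52..63, so together with
+;; lu12i.w and ori an arbitrary 64-bit constant can be synthesized.  %X
+;; prints the constant in hexadecimal; the literal ">>32" and ">>52" are
+;; evaluated by the assembler.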
+
+;; Convert floating-point numbers to integers
+(define_insn "frint_<fmt>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
+		      UNSPEC_FRINT))]
+  "TARGET_HARD_FLOAT"
+  "frint.<fmt>\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "<MODE>")])
+
+;; LoongArch supports loading and storing a floating-point register from/to
+;; the sum of two general-purpose registers.  We use two versions of each of
+;; these four instructions: one where the two general-purpose registers are
+;; SImode, and one where they are DImode.  This is because general-purpose
+;; registers will be in SImode when they hold 32-bit values, but since
+;; the 32-bit values are always sign-extended, the f{ld/st}x.{s/d}
+;; instructions will still work correctly.
+
+;; ??? Perhaps it would be better to support these instructions by
+;; modifying TARGET_LEGITIMATE_ADDRESS_P and friends.  However, since
+;; these instructions can only be used to load and store floating
+;; point registers, that would probably cause trouble in reload.
+
+(define_insn "*<ANYF:floadx>_<P:mode>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+	(mem:ANYF (plus:P (match_operand:P 1 "register_operand" "r")
+			  (match_operand:P 2 "register_operand" "r"))))]
+  "TARGET_HARD_FLOAT"
+  "<ANYF:floadx>\t%0,%1,%2"
+  [(set_attr "type" "fpidxload")
+   (set_attr "mode" "<ANYF:UNITMODE>")])
+
+(define_insn "*<ANYF:fstorex>_<P:mode>"
+  [(set (mem:ANYF (plus:P (match_operand:P 1 "register_operand" "r")
+			  (match_operand:P 2 "register_operand" "r")))
+	(match_operand:ANYF 0 "register_operand" "f"))]
+  "TARGET_HARD_FLOAT"
+  "<ANYF:fstorex>\t%0,%1,%2"
+  [(set_attr "type" "fpidxstore")
+   (set_attr "mode" "<ANYF:UNITMODE>")])
+
+;; Load and store an integer register from/to the sum of two general-purpose
+;; registers.
+
+(define_insn "*<GPR:loadx>_<P:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(mem:GPR
+	    (plus:P (match_operand:P 1 "register_operand" "r")
+		    (match_operand:P 2 "register_operand" "r"))))]
+  ""
+  "<GPR:loadx>\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*<GPR:storex>_<P:mode>"
+  [(set (mem:GPR (plus:P (match_operand:P 1 "register_operand" "r")
+			 (match_operand:P 2 "register_operand" "r")))
+	(match_operand:GPR 0 "reg_or_0_operand" "rJ"))]
+  ""
+  "<GPR:storex>\t%z0,%1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "<GPR:MODE>")])
+
+;; SHORT mode sign_extend.
+(define_insn "*extend_<SHORT:loadx>_<GPR:mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(sign_extend:GPR
+	  (mem:SHORT
+	    (plus:P (match_operand:P 1 "register_operand" "r")
+		    (match_operand:P 2 "register_operand" "r")))))]
+  ""
+  "<SHORT:loadx>\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*extend_<SHORT:storex>"
+  [(set (mem:SHORT (plus:P (match_operand:P 1 "register_operand" "r")
+			   (match_operand:P 2 "register_operand" "r")))
+	(match_operand:SHORT 0 "reg_or_0_operand" "rJ"))]
+  ""
+  "<SHORT:storex>\t%z0,%1,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "SI")])
+
+;; Load the low word of operand 0 with operand 1.
+(define_insn "load_low<mode>"
+  [(set (match_operand:SPLITF 0 "register_operand" "=f,f")
+	(unspec:SPLITF [(match_operand:<HALFMODE> 1 "general_operand" "rJ,m")]
+		       UNSPEC_LOAD_LOW))]
+  "TARGET_HARD_FLOAT"
+{
+  operands[0] = loongarch_subword (operands[0], 0);
+  return loongarch_output_move (operands[0], operands[1]);
+}
+  [(set_attr "move_type" "mgtf,fpload")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Load the high word of operand 0 from operand 1, preserving the value
+;; in the low word.
+(define_insn "load_high<mode>"
+  [(set (match_operand:SPLITF 0 "register_operand" "=f,f")
+	(unspec:SPLITF [(match_operand:<HALFMODE> 1 "general_operand" "rJ,m")
+			(match_operand:SPLITF 2 "register_operand" "0,0")]
+		       UNSPEC_LOAD_HIGH))]
+  "TARGET_HARD_FLOAT"
+{
+  operands[0] = loongarch_subword (operands[0], 1);
+  return loongarch_output_move (operands[0], operands[1]);
+}
+  [(set_attr "move_type" "mgtf,fpload")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Store one word of operand 1 in operand 0.  Operand 2 is 1 to store the
+;; high word and 0 to store the low word.
+(define_insn "store_word<mode>"
+  [(set (match_operand:<HALFMODE> 0 "nonimmediate_operand" "=r,m")
+	(unspec:<HALFMODE> [(match_operand:SPLITF 1 "register_operand" "f,f")
+			    (match_operand 2 "const_int_operand")]
+			   UNSPEC_STORE_WORD))]
+  "TARGET_HARD_FLOAT"
+{
+  operands[1] = loongarch_subword (operands[1], INTVAL (operands[2]));
+  return loongarch_output_move (operands[0], operands[1]);
+}
+  [(set_attr "move_type" "mftg,fpstore")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Thread-Local Storage
+
+(define_insn "got_load_tls_gd<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_GD))]
+  ""
+  "la.tls.gd\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "got_load_tls_ld<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_LD))]
+  ""
+  "la.tls.ld\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "got_load_tls_le<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_LE))]
+  ""
+  "la.tls.le\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "got_load_tls_ie<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec:P
+	    [(match_operand:P 1 "symbolic_operand" "")]
+	    UNSPEC_TLS_IE))]
+  ""
+  "la.tls.ie\t%0,%1"
+  [(set_attr "got" "load")
+   (set_attr "mode" "<MODE>")])
+
+;; Move operand 1 to the high word of operand 0 using movgr2frh.w, preserving the
+;; value in the low word.
+(define_insn "movgr2frh<mode>"
+  [(set (match_operand:SPLITF 0 "register_operand" "=f")
+	(unspec:SPLITF [(match_operand:<HALFMODE> 1 "reg_or_0_operand" "rJ")
+			(match_operand:SPLITF 2 "register_operand" "0")]
+			UNSPEC_MOVGR2FRH))]
+  "TARGET_DOUBLE_FLOAT"
+  "movgr2frh.w\t%z1,%0"
+  [(set_attr "move_type" "mgtf")
+   (set_attr "mode" "<HALFMODE>")])
+
+;; Move high word of operand 1 to operand 0 using movfrh2gr.s.
+(define_insn "movfrh2gr<mode>"
+  [(set (match_operand:<HALFMODE> 0 "register_operand" "=r")
+	(unspec:<HALFMODE> [(match_operand:SPLITF 1 "register_operand" "f")]
+			    UNSPEC_MOVFRH2GR))]
+  "TARGET_DOUBLE_FLOAT"
+  "movfrh2gr.s\t%0,%1"
+  [(set_attr "move_type" "mftg")
+   (set_attr "mode" "<HALFMODE>")])
+
+\f
+;; Expand in-line code to clear the instruction cache between operand[0] and
+;; operand[1].
+(define_expand "clear_cache"
+  [(match_operand 0 "pmode_register_operand")
+   (match_operand 1 "pmode_register_operand")]
+  ""
+  "
+{
+  emit_insn (gen_ibar (const0_rtx));
+  DONE;
+}")
+
+(define_insn "ibar"
+  [(unspec_volatile:SI [(match_operand 0 "const_uimm15_operand")] UNSPECV_IBAR)]
+  ""
+  "ibar\t%0")
+
+(define_insn "dbar"
+  [(unspec_volatile:SI [(match_operand 0 "const_uimm15_operand")] UNSPECV_DBAR)]
+  ""
+  "dbar\t%0")
+
+\f
+
+;; Privileged state instructions
+
+(define_insn "cpucfg"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec_volatile:SI [(match_operand:SI 1 "register_operand" "r")]
+			     UNSPECV_CPUCFG))]
+  ""
+  "cpucfg\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "SI")])
+
+(define_insn "asrtle_d"
+	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
+			      (match_operand:DI 1 "register_operand" "r")]
+			      UNSPECV_ASRTLE_D)]
+  "TARGET_64BIT"
+  "asrtle.d\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "DI")])
+
+(define_insn "asrtgt_d"
+	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
+			      (match_operand:DI 1 "register_operand" "r")]
+			      UNSPECV_ASRTGT_D)]
+  "TARGET_64BIT"
+  "asrtgt.d\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "DI")])
+
+(define_insn "<p>csrrd"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(unspec_volatile:GPR [(match_operand  1 "const_uimm14_operand")]
+			     UNSPECV_CSRRD))]
+  ""
+  "csrrd\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>csrwr"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	  (unspec_volatile:GPR
+	  [(match_operand:GPR 1 "register_operand" "0")
+	   (match_operand 2 "const_uimm14_operand")]
+	  UNSPECV_CSRWR))]
+  ""
+  "csrwr\t%0,%2"
+  [(set_attr "type" "store")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>csrxchg"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	  (unspec_volatile:GPR
+	  [(match_operand:GPR 1 "register_operand" "0")
+	   (match_operand:GPR 2 "register_operand" "q")
+	   (match_operand 3 "const_uimm14_operand")]
+	  UNSPECV_CSRXCHG))]
+  ""
+  "csrxchg\t%0,%2,%3"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "iocsrrd_<size>"
+  [(set (match_operand:QHWD 0 "register_operand" "=r")
+	(unspec_volatile:QHWD [(match_operand:SI 1 "register_operand" "r")]
+			      UNSPECV_IOCSRRD))]
+  ""
+  "iocsrrd.<size>\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "iocsrwr_<size>"
+  [(unspec_volatile:QHWD [(match_operand:QHWD 0 "register_operand" "r")
+			  (match_operand:SI 1 "register_operand" "r")]
+			UNSPECV_IOCSRWR)]
+  ""
+  "iocsrwr.<size>\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>cacop"
+  [(unspec_volatile:X [(match_operand 0 "const_uimm5_operand")
+			 (match_operand:X 1 "register_operand" "r")
+			 (match_operand 2 "const_imm12_operand")]
+			 UNSPECV_CACOP)]
+  ""
+  "cacop\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>lddir"
+  [(unspec_volatile:X [(match_operand:X 0 "register_operand" "r")
+			 (match_operand:X 1 "register_operand" "r")
+			 (match_operand 2 "const_uimm5_operand")]
+			 UNSPECV_LDDIR)]
+  ""
+  "lddir\t%0,%1,%2"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "<p>ldpte"
+  [(unspec_volatile:X [(match_operand:X 0 "register_operand" "r")
+			 (match_operand 1 "const_uimm5_operand")]
+			 UNSPECV_LDPTE)]
+  ""
+  "ldpte\t%0,%1"
+  [(set_attr "type" "load")
+   (set_attr "mode" "<MODE>")])
+
+\f
+;; Block moves, see loongarch.cc for more details.
+;; Argument 0 is the destination.
+;; Argument 1 is the source.
+;; Argument 2 is the length.
+;; Argument 3 is the alignment.
+
+(define_expand "cpymemsi"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+		   (match_operand:BLK 1 "general_operand"))
+	      (use (match_operand:SI 2 ""))
+	      (use (match_operand:SI 3 "const_int_operand"))])]
+  " !TARGET_MEMCPY"
+{
+  if (loongarch_expand_block_move (operands[0], operands[1], operands[2]))
+    DONE;
+  else
+    FAIL;
+})
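+
+;; When TARGET_MEMCPY is set the expander above is disabled entirely, and
+;; when loongarch_expand_block_move declines to expand inline, the FAIL
+;; makes GCC fall back to calling memcpy.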
+\f
+;;
+;;  ....................
+;;
+;;	SHIFTS
+;;
+;;  ....................
+
+(define_insn "<optab><mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(any_shift:GPR (match_operand:GPR 1 "register_operand" "r")
+		       (match_operand:SI 2 "arith_operand" "rI")))]
+  ""
+{
+  if (CONST_INT_P (operands[2]))
+    operands[2] = GEN_INT (INTVAL (operands[2])
+			   & (GET_MODE_BITSIZE (<MODE>mode) - 1));
+
+  return "<insn>%i2.<d>\t%0,%1,%2";
+}
+  [(set_attr "type" "shift")
+   (set_attr "compression" "none")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "*<optab>si3_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(sign_extend:DI
+	   (any_shift:SI (match_operand:SI 1 "register_operand" "r")
+			 (match_operand:SI 2 "arith_operand" "rI"))))]
+  "TARGET_64BIT"
+{
+  if (CONST_INT_P (operands[2]))
+    operands[2] = GEN_INT (INTVAL (operands[2]) & 0x1f);
+
+  return "<insn>%i2.w\t%0,%1,%2";
+}
+  [(set_attr "type" "shift")
+   (set_attr "mode" "SI")])
+
+(define_insn "rotr<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+	(rotatert:GPR (match_operand:GPR 1 "register_operand" "r,r")
+		      (match_operand:SI 2 "arith_operand" "r,I")))]
+  ""
+  "rotr%i2.<d>\t%0,%1,%2"
+  [(set_attr "type" "shift,shift")
+   (set_attr "mode" "<MODE>")])
+
+;; The following templates were added to generate "bstrpick.d + alsl.d"
+;; instruction pairs.
+;; The values of const_immalsl_operand and immediate_operand must satisfy
+;; the following correspondence:
+;;
+;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
+
+(define_insn "zero_extend_ashift1"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
+			   (match_operand 2 "const_immalsl_operand" ""))
+		(match_operand 3 "immediate_operand" "")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "zero_extend_ashift2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+			   (match_operand 2 "const_immalsl_operand" ""))
+		(match_operand 3 "immediate_operand" "")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "alsl_paired1"
+  [(set (match_operand:DI 0 "register_operand" "=&r")
+	(plus:DI (and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
+				    (match_operand 2 "const_immalsl_operand" ""))
+			 (match_operand 3 "immediate_operand" ""))
+		 (match_operand:DI 4 "register_operand" "r")))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
+  [(set_attr "type" "arith")
+  (set_attr "mode" "DI")
+  (set_attr "insn_count" "2")])
+
+(define_insn "alsl_paired2"
+  [(set (match_operand:DI 0 "register_operand" "=&r")
+	(plus:DI (match_operand:DI 1 "register_operand" "r")
+		 (and:DI (ashift:DI (match_operand:DI 2 "register_operand" "r")
+				    (match_operand 3 "const_immalsl_operand" ""))
+			 (match_operand 4 "immediate_operand" ""))))]
+  "TARGET_64BIT
+   && ((INTVAL (operands[4]) >> INTVAL (operands[3])) == 0xffffffff)"
+  "bstrpick.d\t%0,%2,31,0\n\talsl.d\t%0,%0,%1,%3"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "2")])
+
+(define_insn "alsl<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(plus:GPR (ashift:GPR (match_operand:GPR 1 "register_operand" "r")
+			      (match_operand 2 "const_immalsl_operand" ""))
+		  (match_operand:GPR 3 "register_operand" "r")))]
+  ""
+  "alsl.<d>\t%0,%1,%3,%2"
+  [(set_attr "type" "arith")
+   (set_attr "mode" "<MODE>")])
+
+\f
+
+;; Reverse the order of bytes of operand 1 and store the result in operand 0.
+
+(define_insn "bswaphi2"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+	(bswap:HI (match_operand:HI 1 "register_operand" "r")))]
+  ""
+  "revb.2h\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn_and_split "bswapsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(bswap:SI (match_operand:SI 1 "register_operand" "r")))]
+  ""
+  "#"
+  ""
+  [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_REVB_2H))
+   (set (match_dup 0) (rotatert:SI (match_dup 0) (const_int 16)))]
+  ""
+  [(set_attr "insn_count" "2")])
+
+(define_insn_and_split "bswapdi2"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(bswap:DI (match_operand:DI 1 "register_operand" "r")))]
+  "TARGET_64BIT"
+  "#"
+  ""
+  [(set (match_dup 0) (unspec:DI [(match_dup 1)] UNSPEC_REVB_4H))
+   (set (match_dup 0) (unspec:DI [(match_dup 0)] UNSPEC_REVH_D))]
+  ""
+  [(set_attr "insn_count" "2")])
+
+(define_insn "revb_2h"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:SI 1 "register_operand" "r")] UNSPEC_REVB_2H))]
+  ""
+  "revb.2h\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn "revb_4h"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")] UNSPEC_REVB_4H))]
+  "TARGET_64BIT"
+  "revb.4h\t%0,%1"
+  [(set_attr "type" "shift")])
+
+(define_insn "revh_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")] UNSPEC_REVH_D))]
+  "TARGET_64BIT"
+  "revh.d\t%0,%1"
+  [(set_attr "type" "shift")])
+\f
+;;
+;;  ....................
+;;
+;;	CONDITIONAL BRANCHES
+;;
+;;  ....................
+
+;; Conditional branches
+
+(define_insn "*branch_fp_FCCmode"
+  [(set (pc)
+	(if_then_else
+	  (match_operator 1 "equality_operator"
+	      [(match_operand:FCC 2 "register_operand" "z")
+		(const_int 0)])
+	  (label_ref (match_operand 0 "" ""))
+	(pc)))]
+  "TARGET_HARD_FLOAT"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					      LARCH_BRANCH ("b%F1", "%Z2%0"),
+					      LARCH_BRANCH ("b%W1", "%Z2%0"));
+}
+  [(set_attr "type" "branch")])
+
+(define_insn "*branch_fp_inverted_FCCmode"
+  [(set (pc)
+	(if_then_else
+	  (match_operator 1 "equality_operator"
+	    [(match_operand:FCC 2 "register_operand" "z")
+	    (const_int 0)])
+	    (pc)
+	  (label_ref (match_operand 0 "" ""))))]
+  "TARGET_HARD_FLOAT"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					      LARCH_BRANCH ("b%W1", "%Z2%0"),
+					      LARCH_BRANCH ("b%F1", "%Z2%0"));
+}
+  [(set_attr "type" "branch")])
+
+;; Conditional branches on ordered comparisons with zero or a register.
+
+(define_insn "*branch_order<mode>"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "order_operator"
+			 [(match_operand:GPR 2 "register_operand" "r,r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "J,r")])
+	 (label_ref (match_operand 0 "" ""))
+	 (pc)))]
+  ""
+  { return loongarch_output_order_conditional_branch (insn, operands, false); }
+  [(set_attr "type" "branch")])
+
+(define_insn "*branch_order<mode>_inverted"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "order_operator"
+			 [(match_operand:GPR 2 "register_operand" "r,r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "J,r")])
+	 (pc)
+	 (label_ref (match_operand 0 "" ""))))]
+  ""
+  { return loongarch_output_order_conditional_branch (insn, operands, true); }
+  [(set_attr "type" "branch")])
+
+;; Conditional branch on equality comparison.
+
+(define_insn "*branch_equality<mode>"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "equality_operator"
+			 [(match_operand:GPR 2 "register_operand" "r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "rJ")])
+	 (label_ref (match_operand 0 "" ""))
+	 (pc)))]
+  ""
+  { return loongarch_output_equal_conditional_branch (insn, operands, false); }
+  [(set_attr "type" "branch")])
+
+
+(define_insn "*branch_equality<mode>_inverted"
+  [(set (pc)
+	(if_then_else
+	 (match_operator 1 "equality_operator"
+			 [(match_operand:GPR 2 "register_operand" "r")
+			  (match_operand:GPR 3 "reg_or_0_operand" "rJ")])
+	 (pc)
+	 (label_ref (match_operand 0 "" ""))))]
+  ""
+  { return loongarch_output_equal_conditional_branch (insn, operands, true); }
+  [(set_attr "type" "branch")])
+
+
+(define_expand "cbranch<mode>4"
+  [(set (pc)
+	(if_then_else (match_operator 0 "comparison_operator"
+		      [(match_operand:GPR 1 "register_operand")
+			(match_operand:GPR 2 "nonmemory_operand")])
+		      (label_ref (match_operand 3 ""))
+		      (pc)))]
+  ""
+{
+  loongarch_expand_conditional_branch (operands);
+  DONE;
+})
+
+(define_expand "cbranch<mode>4"
+  [(set (pc)
+	(if_then_else (match_operator 0 "comparison_operator"
+			[(match_operand:ANYF 1 "register_operand")
+			(match_operand:ANYF 2 "register_operand")])
+		      (label_ref (match_operand 3 ""))
+		      (pc)))]
+  ""
+{
+  loongarch_expand_conditional_branch (operands);
+  DONE;
+})
+
+;; Used to implement built-in functions.
+(define_expand "condjump"
+  [(set (pc)
+	(if_then_else (match_operand 0)
+		      (label_ref (match_operand 1))
+		      (pc)))])
+
+
+\f
+;;
+;;  ....................
+;;
+;;	SETTING A REGISTER FROM A COMPARISON
+;;
+;;  ....................
+
+;; Destination is always set in SI mode.
+
+(define_expand "cstore<mode>4"
+  [(set (match_operand:SI 0 "register_operand")
+	(match_operator:SI 1 "loongarch_cstore_operator"
+	 [(match_operand:GPR 2 "register_operand")
+	  (match_operand:GPR 3 "nonmemory_operand")]))]
+  ""
+{
+  loongarch_expand_scc (operands);
+  DONE;
+})
+
+(define_insn "*seq_zero_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(eq:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		 (const_int 0)))]
+  ""
+  "sltui\t%0,%1,1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+
+(define_insn "*sne_zero_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(ne:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		 (const_int 0)))]
+  ""
+  "sltu\t%0,%.,%1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*sgt<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_gt:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "reg_or_0_operand" "rJ")))]
+  ""
+  "slt<u>\t%0,%z2,%1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*sge<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_ge:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (const_int 1)))]
+  ""
+  "slt<u>i\t%0,%.,%1"
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*slt<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_lt:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "arith_operand" "rI")))]
+  ""
+  "slt<u>%i2\t%0,%1,%2";
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+(define_insn "*sle<u>_<GPR:mode><GPR2:mode>"
+  [(set (match_operand:GPR2 0 "register_operand" "=r")
+	(any_le:GPR2 (match_operand:GPR 1 "register_operand" "r")
+		     (match_operand:GPR 2 "sle_operand" "")))]
+  ""
+{
+  operands[2] = GEN_INT (INTVAL (operands[2]) + 1);
+  return "slt<u>i\t%0,%1,%2";
+}
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<GPR:MODE>")])
+
+\f
+;;
+;;  ....................
+;;
+;;	FLOATING POINT COMPARISONS
+;;
+;;  ....................
+
+(define_insn "s<code>_<ANYF:mode>_using_FCCmode"
+  [(set (match_operand:FCC 0 "register_operand" "=z")
+	(fcond:FCC (match_operand:ANYF 1 "register_operand" "f")
+		    (match_operand:ANYF 2 "register_operand" "f")))]
+  "TARGET_HARD_FLOAT"
+  "fcmp.<fcond>.<fmt>\t%Z0%1,%2"
+  [(set_attr "type" "fcmp")
+   (set_attr "mode" "FCC")])
+
+\f
+;;
+;;  ....................
+;;
+;;	UNCONDITIONAL BRANCHES
+;;
+;;  ....................
+
+;; Unconditional branches.
+
+(define_expand "jump"
+  [(set (pc)
+	(label_ref (match_operand 0)))])
+
+(define_insn "*jump_absolute"
+  [(set (pc)
+	(label_ref (match_operand 0)))]
+  "!flag_pic"
+{
+  return "b\t%l0";
+}
+  [(set_attr "type" "branch")])
+
+(define_insn "*jump_pic"
+  [(set (pc)
+	(label_ref (match_operand 0)))]
+  "flag_pic"
+{
+  return "b\t%0";
+}
+  [(set_attr "type" "branch")])
+
+(define_expand "indirect_jump"
+  [(set (pc) (match_operand 0 "register_operand"))]
+  ""
+{
+  operands[0] = force_reg (Pmode, operands[0]);
+  emit_jump_insn (PMODE_INSN (gen_indirect_jump, (operands[0])));
+  DONE;
+})
+
+(define_insn "indirect_jump_<mode>"
+  [(set (pc) (match_operand:P 0 "register_operand" "r"))]
+  ""
+  {
+    return "jr\t%0";
+  }
+  [(set_attr "type" "jump")
+   (set_attr "mode" "none")])
+
+(define_expand "tablejump"
+  [(set (pc)
+	(match_operand 0 "register_operand"))
+   (use (label_ref (match_operand 1 "")))]
+  ""
+{
+  if (flag_pic)
+    operands[0] = expand_simple_binop (Pmode, PLUS, operands[0],
+				       gen_rtx_LABEL_REF (Pmode,
+							  operands[1]),
+				       NULL_RTX, 0, OPTAB_DIRECT);
+  emit_jump_insn (PMODE_INSN (gen_tablejump, (operands[0], operands[1])));
+  DONE;
+})
+
+(define_insn "tablejump_<mode>"
+  [(set (pc)
+	(match_operand:P 0 "register_operand" "r"))
+   (use (label_ref (match_operand 1 "" "")))]
+  ""
+  {
+    return "jr\t%0";
+  }
+  [(set_attr "type" "jump")
+   (set_attr "mode" "none")])
+
+
+\f
+;;
+;;  ....................
+;;
+;;	Function prologue/epilogue
+;;
+;;  ....................
+;;
+
+(define_expand "prologue"
+  [(const_int 1)]
+  ""
+{
+  loongarch_expand_prologue ();
+  DONE;
+})
+
+;; Block any insns from being moved before this point, since the
+;; profiling call to mcount can use various registers that aren't
+;; saved or used to pass arguments.
+
+(define_insn "blockage"
+  [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
+  ""
+  ""
+  [(set_attr "type" "ghost")
+   (set_attr "mode" "none")])
+
+(define_insn "probe_stack_range_<P:mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+	(unspec_volatile:P [(match_operand:P 1 "register_operand" "0")
+			    (match_operand:P 2 "register_operand" "r")
+			    (match_operand:P 3 "register_operand" "r")]
+			    UNSPECV_PROBE_STACK_RANGE))]
+  ""
+{
+  return loongarch_output_probe_stack_range (operands[0],
+					     operands[2],
+					     operands[3]);
+}
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "epilogue"
+  [(const_int 2)]
+  ""
+{
+  loongarch_expand_epilogue (false);
+  DONE;
+})
+
+(define_expand "sibcall_epilogue"
+  [(const_int 2)]
+  ""
+{
+  loongarch_expand_epilogue (true);
+  DONE;
+})
+
+;; Trivial return.  Make it look like a normal return insn as that
+;; allows jump optimizations to work better.
+
+(define_expand "return"
+  [(simple_return)]
+  "loongarch_can_use_return_insn ()"
+  { })
+
+(define_expand "simple_return"
+  [(simple_return)]
+  ""
+  { })
+
+(define_insn "*<optab>"
+  [(any_return)]
+  ""
+  {
+    operands[0] = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
+    return "jr\t%0";
+  }
+  [(set_attr "type"	"jump")
+   (set_attr "mode"	"none")])
+
+;; Normal return.
+
+(define_insn "<optab>_internal"
+  [(any_return)
+   (use (match_operand 0 "pmode_register_operand" ""))]
+  ""
+  {
+    return "jr\t%0";
+  }
+  [(set_attr "type"	"jump")
+   (set_attr "mode"	"none")])
+
+;; Exception return.
+(define_insn "loongarch_ertn"
+  [(return)
+   (unspec_volatile [(const_int 0)] UNSPECV_ERTN)]
+  ""
+  "ertn"
+  [(set_attr "type"	"trap")
+   (set_attr "mode"	"none")])
+
+;; This is used in compiling the unwind routines.
+(define_expand "eh_return"
+  [(use (match_operand 0 "general_operand"))]
+  ""
+{
+  if (GET_MODE (operands[0]) != word_mode)
+    operands[0] = convert_to_mode (word_mode, operands[0], 0);
+  if (TARGET_64BIT)
+    emit_insn (gen_eh_set_ra_di (operands[0]));
+  else
+    emit_insn (gen_eh_set_ra_si (operands[0]));
+  DONE;
+})
+
+;; Clobber the return address on the stack.  We can't expand this
+;; until we know where it will be put in the stack frame.
+
+(define_insn "eh_set_ra_si"
+  [(unspec [(match_operand:SI 0 "register_operand" "r")] UNSPEC_EH_RETURN)
+   (clobber (match_scratch:SI 1 "=&r"))]
+  "! TARGET_64BIT"
+  "#")
+
+(define_insn "eh_set_ra_di"
+  [(unspec [(match_operand:DI 0 "register_operand" "r")] UNSPEC_EH_RETURN)
+   (clobber (match_scratch:DI 1 "=&r"))]
+  "TARGET_64BIT"
+  "#")
+
+(define_split
+  [(unspec [(match_operand 0 "register_operand")] UNSPEC_EH_RETURN)
+   (clobber (match_scratch 1))]
+  "reload_completed"
+  [(const_int 0)]
+{
+  loongarch_set_return_address (operands[0], operands[1]);
+  DONE;
+})
+
+
+\f
+;;
+;;  ....................
+;;
+;;	FUNCTION CALLS
+;;
+;;  ....................
+
+;; Sibling calls.  All these patterns use jump instructions.
+
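+;; The call_insn_operand constraints (j,c,a,t,h) distinguish an indirect
+;; register call from the various kinds of direct call; the switch on
+;; which_alternative in each pattern then picks a sequence appropriate for
+;; the active code model.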
+(define_expand "sibcall"
+  [(parallel [(call (match_operand 0 "")
+		    (match_operand 1 ""))
+	      (use (match_operand 2 ""))	;; next_arg_reg
+	      (use (match_operand 3 ""))])]	;; struct_value_size_rtx
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[0], 0));
+
+  emit_call_insn (gen_sibcall_internal (target, operands[1]));
+  DONE;
+})
+
+(define_insn "sibcall_internal"
+  [(call (mem:SI (match_operand 0 "call_insn_operand" "j,c,a,t,h"))
+	 (match_operand 1 "" ""))]
+  "SIBLING_CALL_P (insn)"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jr\t%0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%pcrel(%0+0x20000))>>18\n\t"
+	       "jirl\t$r0,$r12,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r12,$r13,%0\n\tjr\t$r12";
+      else
+	return "b\t%0";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "b\t%0";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%0\n\tjr\t$r12";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%0\n\tjr\t$r12";
+    case 4:
+      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "b\t%%plt(%0)";
+      else if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%plt(%0)+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
+      else
+	{
+	  sorry ("cmodel extreme and tiny static not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
+
+(define_expand "sibcall_value"
+  [(parallel [(set (match_operand 0 "")
+		   (call (match_operand 1 "")
+			 (match_operand 2 "")))
+	      (use (match_operand 3 ""))])]		;; next_arg_reg
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
+
+  /* Handle return values created by loongarch_pass_fpr_pair.  */
+  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
+    {
+      rtx arg1 = XEXP (XVECEXP (operands[0], 0, 0), 0);
+      rtx arg2 = XEXP (XVECEXP (operands[0], 0, 1), 0);
+
+      emit_call_insn (gen_sibcall_value_multiple_internal (arg1, target,
+							   operands[2],
+							   arg2));
+    }
+  else
+    {
+      /* Handle return values created by loongarch_return_fpr_single.  */
+      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
+	operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);
+
+      emit_call_insn (gen_sibcall_value_internal (operands[0], target,
+						  operands[2]));
+    }
+  DONE;
+})
+
+(define_insn "sibcall_value_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "j,c,a,t,h"))
+	      (match_operand 2 "" "")))]
+  "SIBLING_CALL_P (insn)"
+{
+  switch (which_alternative)
+  {
+    case 0:
+      return "jr\t%1";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%pcrel(%1+4)-((%%pcrel(%1+4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "b\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "b\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 4:
+      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return " b\t%%plt(%1)";
+      else if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+  }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
+
+(define_insn "sibcall_value_multiple_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "j,c,a,t,h"))
+	      (match_operand 2 "" "")))
+   (set (match_operand 3 "register_operand" "")
+	(call (mem:SI (match_dup 1))
+	      (match_dup 2)))]
+  "SIBLING_CALL_P (insn)"
+{
+  switch (which_alternative)
+  {
+    case 0:
+      return "jr\t%1";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%pcrel(%1+4)-(%%pcrel(%1+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "b\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "b\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r12,$r13,%1\n\tjr\t$r12";
+      else
+	return "la.global\t$r12,%1\n\tjr\t$r12";
+    case 4:
+      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "b\t%%plt(%1)";
+      else if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r12,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r0,$r12,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+  }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
+
+(define_expand "call"
+  [(parallel [(call (match_operand 0 "")
+		    (match_operand 1 ""))
+	      (use (match_operand 2 ""))	;; next_arg_reg
+	      (use (match_operand 3 ""))])]	;; struct_value_size_rtx
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[0], 0));
+
+  emit_call_insn (gen_call_internal (target, operands[1]));
+  DONE;
+})
+
+(define_insn "call_internal"
+  [(call (mem:SI (match_operand 0 "call_insn_operand" "e,c,a,t,h"))
+	 (match_operand 1 "" ""))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  ""
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jirl\t$r1,%0,0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,%%pcrel(%0+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r1,$r12,%0\n\tjirl\t$r1,$r1,0";
+      else
+	return "bl\t%0";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "bl\t%0";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%0\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%0\n\tjirl\t$r1,$r1,0";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%0\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%0\n\tjirl\t$r1,$r1,0";
+    case 4:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,(%%plt(%0)+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "bl\t%%plt(%0)";
+      else
+	{
+	  sorry ("cmodel extreme and tiny-static not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")
+   (set_attr "insn_count" "1,2,3,3,2")])
+
+(define_expand "call_value"
+  [(parallel [(set (match_operand 0 "")
+		   (call (match_operand 1 "")
+			 (match_operand 2 "")))
+	      (use (match_operand 3 ""))])]		;; next_arg_reg
+  ""
+{
+  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
+  /* Handle return values created by loongarch_pass_fpr_pair.  */
+  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
+    {
+      rtx arg1 = XEXP (XVECEXP (operands[0], 0, 0), 0);
+      rtx arg2 = XEXP (XVECEXP (operands[0], 0, 1), 0);
+
+      emit_call_insn (gen_call_value_multiple_internal (arg1, target,
+							operands[2], arg2));
+    }
+  else
+    {
+      /* Handle return values created by loongarch_return_fpr_single.  */
+      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
+	operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);
+
+      emit_call_insn (gen_call_value_internal (operands[0], target,
+					       operands[2]));
+    }
+  DONE;
+})
+
+(define_insn "call_value_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "e,c,a,t,h"))
+	      (match_operand 2 "" "")))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  ""
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jirl\t$r1,%1,0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%pcrel(%1+4)-(%%pcrel(%1+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "bl\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "bl\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 4:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "bl\t%%plt(%1)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")
+   (set_attr "insn_count" "1,2,3,3,2")])
+
+(define_insn "call_value_multiple_internal"
+  [(set (match_operand 0 "register_operand" "")
+	(call (mem:SI (match_operand 1 "call_insn_operand" "e,c,a,t,h"))
+	      (match_operand 2 "" "")))
+   (set (match_operand 3 "register_operand" "")
+	(call (mem:SI (match_dup 1))
+	      (match_dup 2)))
+   (clobber (reg:SI RETURN_ADDR_REGNUM))]
+  ""
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "jirl\t$r1,%1,0";
+    case 1:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,%%pcrel(%1+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%pcrel(%1+4)-(%%pcrel(%1+4+0x20000)>>18<<18)";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.local\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "bl\t%1";
+    case 2:
+      if (TARGET_CMODEL_TINY_STATIC)
+	return "bl\t%1";
+      else if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0 ";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 3:
+      if (TARGET_CMODEL_EXTREME)
+	return "la.global\t$r1,$r12,%1\n\tjirl\t$r1,$r1,0";
+      else
+	return "la.global\t$r1,%1\n\tjirl\t$r1,$r1,0";
+    case 4:
+      if (TARGET_CMODEL_LARGE)
+	return "pcaddu18i\t$r1,(%%plt(%1)+0x20000)>>18\n\t"
+	       "jirl\t$r1,$r1,%%plt(%1)+4-((%%plt(%1)+(4+0x20000))>>18<<18)";
+      else if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
+	return "bl\t%%plt(%1)";
+      else
+	{
+	  sorry ("loongarch cmodel extreme and tiny-static not support plt");
+	  return "";  /* GCC complains about may fall through.  */
+	}
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "jirl" "indirect,direct,direct,direct,direct")
+   (set_attr "insn_count" "1,2,3,3,2")])
+
+
+;; Call subroutine returning any type.
+(define_expand "untyped_call"
+  [(parallel [(call (match_operand 0 "")
+		    (const_int 0))
+	      (match_operand 1 "")
+	      (match_operand 2 "")])]
+  ""
+{
+  int i;
+
+  emit_call_insn (gen_call (operands[0], const0_rtx, NULL, const0_rtx));
+
+  for (i = 0; i < XVECLEN (operands[2], 0); i++)
+    {
+      rtx set = XVECEXP (operands[2], 0, i);
+      loongarch_emit_move (SET_DEST (set), SET_SRC (set));
+    }
+
+  emit_insn (gen_blockage ());
+  DONE;
+})
+\f
+;;
+;;  ....................
+;;
+;;	MISC.
+;;
+;;  ....................
+;;
+
+(define_insn "nop"
+  [(const_int 0)]
+  ""
+  "nop"
+  [(set_attr "type"	"nop")
+   (set_attr "mode"	"none")])
+
+;; __builtin_loongarch_movfcsr2gr: move the FCSR selected by operand 1 into
+;; operand 0.
+(define_insn "loongarch_movfcsr2gr"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+    (unspec_volatile:SI [(match_operand 1 "const_uimm5_operand")]
+    UNSPECV_MOVFCSR2GR))]
+  "TARGET_HARD_FLOAT"
+  "movfcsr2gr\t%0,$r%1")
+
+;; __builtin_loongarch_movgr2fcsr: move operand 1 into the FCSR selected by
+;; operand 0.
+(define_insn "loongarch_movgr2fcsr"
+  [(unspec_volatile [(match_operand 0 "const_uimm5_operand")
+		     (match_operand:SI 1 "register_operand" "r")]
+	  UNSPECV_MOVGR2FCSR)]
+  "TARGET_HARD_FLOAT"
+  "movgr2fcsr\t$r%0,%1")
+
+(define_insn "fclass_<fmt>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+       (unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
+		UNSPEC_FCLASS))]
+  "TARGET_HARD_FLOAT"
+  "fclass.<fmt>\t%0,%1"
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "bytepick_w"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+		   (match_operand:SI 2 "register_operand" "r")
+		   (match_operand:SI 3 "const_0_to_3_operand" "n")]
+	      UNSPEC_BYTEPICK_W))]
+  ""
+  "bytepick.w\t%0,%1,%2,%z3"
+  [(set_attr "mode"    "SI")])
+
+(define_insn "bytepick_d"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+		   (match_operand:DI 2 "register_operand" "r")
+		   (match_operand:DI 3 "const_0_to_7_operand" "n")]
+	      UNSPEC_BYTEPICK_D))]
+  ""
+  "bytepick.d\t%0,%1,%2,%z3"
+  [(set_attr "mode"    "DI")])
+
+(define_insn "bitrev_4b"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:SI 1 "register_operand" "r")]
+	    UNSPEC_BITREV_4B))]
+  ""
+  "bitrev.4b\t%0,%1"
+  [(set_attr "type"    "unknown")
+   (set_attr "mode"    "SI")])
+
+(define_insn "bitrev_8b"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec:DI [(match_operand:DI 1 "register_operand" "r")]
+	    UNSPEC_BITREV_8B))]
+  ""
+  "bitrev.8b\t%0,%1"
+  [(set_attr "type"    "unknown")
+   (set_attr "mode"    "DI")])
+
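+;; stack_tie emits no code; the BLKmode store only tells the scheduler and
+;; alias analysis not to move stack accesses across this point.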
+(define_insn "stack_tie<mode>"
+  [(set (mem:BLK (scratch))
+	(unspec:BLK [(match_operand:GPR 0 "register_operand" "r")
+		     (match_operand:GPR 1 "register_operand" "r")]
+		    UNSPEC_TIE))]
+  ""
+  ""
+  [(set_attr "length" "0")]
+)
+
+(define_insn "gpr_restore_return"
+  [(return)
+   (use (match_operand 0 "pmode_register_operand" ""))
+   (const_int 0)]
+  ""
+  "")
+\f
+(define_split
+  [(match_operand 0 "small_data_pattern")]
+  "reload_completed"
+  [(match_dup 0)]
+  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
+
+
+;; Match paired HI/SI/SF/DFmode load/stores.
+(define_insn "*join2_load_store<JOIN_MODE:mode>"
+  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
+  "=r,f,m,m,r,ZC,r,k,f,k")
+	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
+   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
+   "=r,f,m,m,r,ZC,r,k,f,k")
+	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
+  "reload_completed"
+  {
+    bool load_p = (which_alternative == 0 || which_alternative == 1);
+    /* The reg-renaming pass reuses the base register if it is dead after the
+       bonded loads.  Hardware does not bond those loads, even when they are
+       consecutive.  However, the order of the loads needs to be checked for
+       correctness.  */
+    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
+      {
+	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
+			 operands);
+	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
+			 &operands[2]);
+      }
+    else
+      {
+	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
+			 &operands[2]);
+	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
+			 operands);
+      }
+    return "";
+  }
+  [(set_attr "move_type"
+  "load,fpload,store,fpstore,load,store,load,store,fpload,fpstore")
+   (set_attr "insn_count" "2,2,2,2,2,2,2,2,2,2")])
+
+;; 2 HI/SI/SF/DF loads are bonded.
+(define_peephole2
+  [(set (match_operand:JOIN_MODE 0 "register_operand")
+	(match_operand:JOIN_MODE 1 "non_volatile_mem_operand"))
+   (set (match_operand:JOIN_MODE 2 "register_operand")
+	(match_operand:JOIN_MODE 3 "non_volatile_mem_operand"))]
+  "loongarch_load_store_bonding_p (operands, <JOIN_MODE:MODE>mode, true)"
+  [(parallel [(set (match_dup 0)
+		   (match_dup 1))
+	      (set (match_dup 2)
+		   (match_dup 3))])]
+  "")
+
+;; 2 HI/SI/SF/DF stores are bonded.
+(define_peephole2
+  [(set (match_operand:JOIN_MODE 0 "memory_operand")
+	(match_operand:JOIN_MODE 1 "register_operand"))
+   (set (match_operand:JOIN_MODE 2 "memory_operand")
+	(match_operand:JOIN_MODE 3 "register_operand"))]
+  "loongarch_load_store_bonding_p (operands, <JOIN_MODE:MODE>mode, false)"
+  [(parallel [(set (match_dup 0)
+		   (match_dup 1))
+	      (set (match_dup 2)
+		   (match_dup 3))])]
+  "")
+
+;; Match paired HImode loads.
+(define_insn "*join2_loadhi"
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
+	(any_extend:SI (match_operand:HI 1 "non_volatile_mem_operand" "m,k")))
+   (set (match_operand:SI 2 "register_operand" "=r,r")
+	(any_extend:SI (match_operand:HI 3 "non_volatile_mem_operand" "m,k")))]
+  "reload_completed"
+  {
+    /* The reg-renaming pass reuses the base register if it is dead after the
+       bonded loads.  Hardware does not bond those loads, even when they are
+       consecutive.  However, the order of the loads needs to be checked for
+       correctness.  */
+    if (!reg_overlap_mentioned_p (operands[0], operands[1]))
+      {
+	output_asm_insn ("ld.h<u>\t%0,%1", operands);
+	output_asm_insn ("ld.h<u>\t%2,%3", operands);
+      }
+    else
+      {
+	output_asm_insn ("ld.h<u>\t%2,%3", operands);
+	output_asm_insn ("ld.h<u>\t%0,%1", operands);
+      }
+
+    return "";
+  }
+  [(set_attr "move_type" "load,load")
+   (set_attr "insn_count" "2,2")])
+
+
+;; 2 HI loads are bonded.
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+	(any_extend:SI (match_operand:HI 1 "non_volatile_mem_operand")))
+   (set (match_operand:SI 2 "register_operand")
+	(any_extend:SI (match_operand:HI 3 "non_volatile_mem_operand")))]
+  "loongarch_load_store_bonding_p (operands, HImode, true)"
+  [(parallel [(set (match_dup 0)
+		   (any_extend:SI (match_dup 1)))
+	      (set (match_dup 2)
+		   (any_extend:SI (match_dup 3)))])]
+  "")
+
+\f
+
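+;; CRC instructions.
+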
+(define_mode_iterator QHSD [QI HI SI DI])
+
+(define_insn "crc_w_<size>_w"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:QHSD 1 "register_operand" "r")
+		   (match_operand:SI 2 "register_operand" "r")]
+		     UNSPEC_CRC))]
+  ""
+  "crc.w.<size>.w\t%0,%1,%2"
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "crcc_w_<size>_w"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(unspec:SI [(match_operand:QHSD 1 "register_operand" "r")
+		   (match_operand:SI 2 "register_operand" "r")]
+		     UNSPEC_CRCC))]
+  ""
+  "crcc.w.<size>.w\t%0,%1,%2"
+  [(set_attr "type" "unknown")
+   (set_attr "mode" "<MODE>")])
+
+;; Synchronization instructions.
+
+(include "sync.md")
+
+(include "generic.md")
+(include "la464.md")
+
+(define_c_enum "unspec" [
+  UNSPEC_ADDRESS_FIRST
+])
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
new file mode 100644
index 00000000000..ec5f74f4454
--- /dev/null
+++ b/gcc/config/loongarch/predicates.md
@@ -0,0 +1,527 @@
+;; Predicate definitions for LoongArch target.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on MIPS target for GNU compiler.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_predicate "const_uns_arith_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND_UNSIGNED (INTVAL (op))")))
+
+(define_predicate "uns_arith_operand"
+  (ior (match_operand 0 "const_uns_arith_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "const_lu32i_operand"
+  (and (match_code "const_int")
+       (match_test "LU32I_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_lu52i_operand"
+  (and (match_code "const_int")
+       (match_test "LU52I_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_arith_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_imm16_operand"
+  (and (match_code "const_int")
+       (match_test "IMM16_OPERAND (INTVAL (op))")))
+
+(define_predicate "arith_operand"
+  (ior (match_operand 0 "const_arith_operand")
+       (match_operand 0 "register_operand")))
+
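+;; Shift amounts accepted by the alsl instructions (1..4).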
+(define_predicate "const_immalsl_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 1, 4)")))
+
+(define_predicate "const_uimm3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_uimm4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 15)")))
+
+(define_predicate "const_uimm5_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 31)")))
+
+(define_predicate "const_uimm6_operand"
+  (and (match_code "const_int")
+       (match_test "UIMM6_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_uimm7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 127)")))
+
+(define_predicate "const_uimm8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
+(define_predicate "const_uimm14_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 16383)")))
+
+(define_predicate "const_uimm15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 32767)")))
+
+(define_predicate "const_imm5_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -16, 15)")))
+
+(define_predicate "const_imm10_operand"
+  (and (match_code "const_int")
+       (match_test "IMM10_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_imm12_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (INTVAL (op))")))
+
+(define_predicate "reg_imm10_operand"
+  (ior (match_operand 0 "const_imm10_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "aq8b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "aq8h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 1)")))
+
+(define_predicate "aq8w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "aq8d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "aq10b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 0)")))
+
+(define_predicate "aq10h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 1)")))
+
+(define_predicate "aq10w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq10d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 3)")))
+
+(define_predicate "aq12b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 12, 0)")))
+
+(define_predicate "aq12h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 11, 1)")))
+
+(define_predicate "aq12w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq12d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 9, 3)")))
+
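+;; Accept constants c for which c + 1 still fits in a signed 12-bit
+;; immediate, so that "x <= c" can be rewritten as "x < c + 1".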
+(define_predicate "sle_operand"
+  (and (match_code "const_int")
+       (match_test "IMM12_OPERAND (INTVAL (op) + 1)")))
+
+(define_predicate "sleu_operand"
+  (and (match_operand 0 "sle_operand")
+       (match_test "INTVAL (op) + 1 != 0")))
+
+(define_predicate "const_0_operand"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "op == CONST0_RTX (GET_MODE (op))")))
+
+(define_predicate "const_m1_operand"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
+
+(define_predicate "reg_or_m1_operand"
+  (ior (match_operand 0 "const_m1_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "reg_or_0_operand"
+  (ior (match_operand 0 "const_0_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "const_1_operand"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "op == CONST1_RTX (GET_MODE (op))")))
+
+(define_predicate "reg_or_1_operand"
+  (ior (match_operand 0 "const_1_operand")
+       (match_operand 0 "register_operand")))
+
+;; These are used in vec_merge, hence accept bitmask as const_int.
+(define_predicate "const_exp_2_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 1)")))
+
+(define_predicate "const_exp_4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 3)")))
+
+(define_predicate "const_exp_8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 7)")))
+
+(define_predicate "const_exp_16_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 15)")))
+
+(define_predicate "const_exp_32_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 31)")))
+
+;; This is used for indexing into vectors, and hence only accepts const_int.
+(define_predicate "const_0_or_1_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
+
+(define_predicate "const_2_or_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 2, 3)")))
+
+(define_predicate "const_0_to_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 3)")))
+
+(define_predicate "const_0_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_4_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 4, 7)")))
+
+(define_predicate "const_8_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_16_to_31_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "qi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xff")))
+
+(define_predicate "hi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffff")))
+
+(define_predicate "lu52i_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xfffffffffffff")))
+
+(define_predicate "si_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffffffff")))
+
+(define_predicate "low_bitmask_operand"
+  (and (match_code "const_int")
+       (match_test "low_bitmask_len (mode, INTVAL (op)) > 12")))
+
+(define_predicate "d_operand"
+  (and (match_code "reg")
+       (match_test "GP_REG_P (REGNO (op))")))
+
+(define_predicate "db4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 4, 0)")))
+
+(define_predicate "db7_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 7, 0)")))
+
+(define_predicate "db8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 8, 0)")))
+
+(define_predicate "ib3_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) - 1, 3, 0)")))
+
+(define_predicate "sb4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "sb5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 5, 0)")))
+
+(define_predicate "sb8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "sd8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "ub4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "ub8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "uh4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 1)")))
+
+(define_predicate "uw4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 2)")))
+
+(define_predicate "uw5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 5, 2)")))
+
+(define_predicate "uw6_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 6, 2)")))
+
+(define_predicate "uw8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "addiur2_operand"
+  (and (match_code "const_int")
+	(ior (match_test "INTVAL (op) == -1")
+	     (match_test "INTVAL (op) == 1")
+	     (match_test "INTVAL (op) == 4")
+	     (match_test "INTVAL (op) == 8")
+	     (match_test "INTVAL (op) == 12")
+	     (match_test "INTVAL (op) == 16")
+	     (match_test "INTVAL (op) == 20")
+	     (match_test "INTVAL (op) == 24"))))
+
+(define_predicate "addiusp_operand"
+  (and (match_code "const_int")
+       (ior (match_test "(IN_RANGE (INTVAL (op), 2, 257))")
+	    (match_test "(IN_RANGE (INTVAL (op), -258, -3))"))))
+
+(define_predicate "andi16_operand"
+  (and (match_code "const_int")
+	(ior (match_test "IN_RANGE (INTVAL (op), 1, 4)")
+	     (match_test "IN_RANGE (INTVAL (op), 7, 8)")
+	     (match_test "IN_RANGE (INTVAL (op), 15, 16)")
+	     (match_test "IN_RANGE (INTVAL (op), 31, 32)")
+	     (match_test "IN_RANGE (INTVAL (op), 63, 64)")
+	     (match_test "INTVAL (op) == 255")
+	     (match_test "INTVAL (op) == 32768")
+	     (match_test "INTVAL (op) == 65535"))))
+
+(define_predicate "movep_src_register"
+  (and (match_code "reg")
+       (ior (match_test ("IN_RANGE (REGNO (op), 2, 3)"))
+	    (match_test ("IN_RANGE (REGNO (op), 16, 20)")))))
+
+(define_predicate "movep_src_operand"
+  (ior (match_operand 0 "const_0_operand")
+       (match_operand 0 "movep_src_register")))
+
+(define_predicate "fcc_reload_operand"
+  (and (match_code "reg,subreg")
+       (match_test "FCC_REG_P (true_regnum (op))")))
+
+(define_predicate "muldiv_target_operand"
+		(match_operand 0 "register_operand"))
+
+(define_predicate "const_call_insn_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum loongarch_symbol_type symbol_type;
+
+  if (!loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_CALL, &symbol_type))
+    return false;
+
+  switch (symbol_type)
+    {
+    case SYMBOL_GOT_DISP:
+      /* Without explicit relocs, there is no special syntax for
+	 loading the address of a call destination into a register.
+	 Using "la.global JIRL_REGS,foo; jirl JIRL_REGS" would prevent the lazy
+	 binding of "foo", so keep the address of global symbols with the jirl
+	 macro.  */
+      return true;
+
+    default:
+      return false;
+    }
+})
+
+(define_predicate "call_insn_operand"
+  (ior (match_operand 0 "const_call_insn_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "is_const_call_local_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (ior (match_test "loongarch_global_symbol_p (op) == 0")
+       (match_test "loongarch_symbol_binds_local_p (op) != 0"))
+       (match_test "CONSTANT_P (op)")))
+
+(define_predicate "is_const_call_weak_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (not (match_operand 0 "is_const_call_local_symbol"))
+       (match_test "loongarch_weak_symbol_p (op) != 0")
+       (match_test "CONSTANT_P (op)")))
+
+(define_predicate "is_const_call_plt_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (match_test "flag_plt != 0")
+       (match_test "loongarch_global_symbol_noweak_p (op) != 0")
+       (match_test "CONSTANT_P (op)")))
+
+(define_predicate "is_const_call_global_noplt_symbol"
+  (and (match_operand 0 "const_call_insn_operand")
+       (match_test "flag_plt == 0")
+       (match_test "loongarch_global_symbol_noweak_p (op) != 0")
+       (match_test "CONSTANT_P (op)")))
+
+;; A legitimate CONST_INT operand that takes more than one instruction
+;; to load.
+(define_predicate "splittable_const_int_operand"
+  (match_code "const_int")
+{
+  /* Don't handle multi-word moves this way; we don't want to introduce
+     the individual word-mode moves until after reload.  */
+  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+    return false;
+
+  /* Otherwise check whether the constant can be loaded in a single
+     instruction.  */
+  return !LU12I_INT (op) && !IMM12_INT (op) && !IMM12_INT_UNSIGNED (op)
+	 && !LU52I_INT (op);
+})
+
+(define_predicate "move_operand"
+  (match_operand 0 "general_operand")
+{
+  enum loongarch_symbol_type symbol_type;
+
+  /* The thinking here is as follows:
+
+     (1) The move expanders should split complex load sequences into
+	 individual instructions.  Those individual instructions can
+	 then be optimized by all rtl passes.
+
+     (2) The target of pre-reload load sequences should not be used
+	 to store temporary results.  If the target register is only
+	 assigned one value, reload can rematerialize that value
+	 on demand, rather than spill it to the stack.
+
+     (3) If we allowed pre-reload passes like combine and cse to recreate
+	 complex load sequences, we would want to be able to split the
+	 sequences before reload as well, so that the pre-reload scheduler
+	 can see the individual instructions.  This falls foul of (2);
+	 the splitter would be forced to reuse the target register for
+	 intermediate results.
+
+     (4) We want to define complex load splitters for combine.  These
+	 splitters can request a temporary scratch register, which avoids
+	 the problem in (2).  They allow things like:
+
+	      (set (reg T1) (high SYM))
+	      (set (reg T2) (low (reg T1) SYM))
+	      (set (reg X) (plus (reg T2) (const_int OFFSET)))
+
+	 to be combined into:
+
+	      (set (reg T3) (high SYM+OFFSET))
+	      (set (reg X) (lo_sum (reg T3) SYM+OFFSET))
+
+	 if T2 is only used this once.  */
+  switch (GET_CODE (op))
+    {
+    case CONST_INT:
+      return !splittable_const_int_operand (op, mode);
+
+    case CONST:
+    case SYMBOL_REF:
+    case LABEL_REF:
+      return (loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA,
+					     &symbol_type));
+    default:
+      return true;
+    }
+})
+
+(define_predicate "consttable_operand"
+  (match_test "CONSTANT_P (op)"))
+
+(define_predicate "symbolic_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum loongarch_symbol_type type;
+  return loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA, &type);
+})
+
+(define_predicate "got_disp_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum loongarch_symbol_type type;
+  return (loongarch_symbolic_constant_p (op, SYMBOL_CONTEXT_LEA, &type)
+	  && type == SYMBOL_GOT_DISP);
+})
+
+(define_predicate "symbol_ref_operand"
+  (match_code "symbol_ref"))
+
+(define_predicate "equality_operator"
+  (match_code "eq,ne"))
+
+(define_predicate "extend_operator"
+  (match_code "zero_extend,sign_extend"))
+
+(define_predicate "trap_comparison_operator"
+  (match_code "eq,ne,lt,ltu,ge,geu"))
+
+(define_predicate "order_operator"
+  (match_code "lt,ltu,le,leu,ge,geu,gt,gtu"))
+
+;; For NE, cstore uses sltu instructions in which the first operand is $0.
+
+(define_predicate "loongarch_cstore_operator"
+  (match_code "ne,eq,gt,gtu,ge,geu,lt,ltu,le,leu"))
+
+(define_predicate "small_data_pattern"
+  (and (match_code "set,parallel,unspec,unspec_volatile,prefetch")
+       (match_test "loongarch_small_data_pattern_p (op)")))
+
+(define_predicate "mem_noofs_operand"
+  (and (match_code "mem")
+       (match_code "reg" "0")))
+
+;; Return 1 if the operand is in non-volatile memory.
+(define_predicate "non_volatile_mem_operand"
+  (and (match_operand 0 "memory_operand")
+       (not (match_test "MEM_VOLATILE_P (op)"))))
diff --git a/gcc/config/loongarch/sync.md b/gcc/config/loongarch/sync.md
new file mode 100644
index 00000000000..0c4f1983e88
--- /dev/null
+++ b/gcc/config/loongarch/sync.md
@@ -0,0 +1,574 @@
+;; Machine description for LoongArch atomic operations.
+;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
+;; Contributed by Loongson Ltd.
+;; Based on MIPS and RISC-V target for GNU compiler.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_c_enum "unspec" [
+  UNSPEC_COMPARE_AND_SWAP
+  UNSPEC_COMPARE_AND_SWAP_ADD
+  UNSPEC_COMPARE_AND_SWAP_SUB
+  UNSPEC_COMPARE_AND_SWAP_AND
+  UNSPEC_COMPARE_AND_SWAP_XOR
+  UNSPEC_COMPARE_AND_SWAP_OR
+  UNSPEC_COMPARE_AND_SWAP_NAND
+  UNSPEC_SYNC_OLD_OP
+  UNSPEC_SYNC_EXCHANGE
+  UNSPEC_ATOMIC_STORE
+  UNSPEC_MEMORY_BARRIER
+])
+
+(define_code_iterator any_atomic [plus ior xor and])
+(define_code_attr atomic_optab
+  [(plus "add") (ior "or") (xor "xor") (and "and")])
+
+;; This attribute gives the format suffix for atomic memory operations.
+(define_mode_attr amo [(SI "w") (DI "d")])
+
+;; <amop> expands to the name of the atomic operation that implements a
+;; particular code.
+(define_code_attr amop [(ior "or") (xor "xor") (and "and") (plus "add")])
+
+;; Memory barriers.
+
+(define_expand "mem_thread_fence"
+  [(match_operand:SI 0 "const_int_operand" "")] ;; model
+  ""
+{
+  if (INTVAL (operands[0]) != MEMMODEL_RELAXED)
+    {
+      rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+      MEM_VOLATILE_P (mem) = 1;
+      emit_insn (gen_mem_thread_fence_1 (mem, operands[0]));
+    }
+  DONE;
+})
+
+;; Until the LoongArch memory model (and hence its mapping from C++) is
+;; finalized, conservatively emit a full barrier (dbar 0).
+(define_insn "mem_thread_fence_1"
+  [(set (match_operand:BLK 0 "" "")
+	(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
+   (match_operand:SI 1 "const_int_operand" "")] ;; model
+  ""
+  "dbar\t0")
+
+;; Atomic memory operations.
+
+;; Implement atomic stores with amswap.  Fall back to fences for atomic loads.
+(define_insn "atomic_store<mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+ZB")
+    (unspec_volatile:GPR
+      [(match_operand:GPR 1 "reg_or_0_operand" "rJ")
+       (match_operand:SI 2 "const_int_operand")]      ;; model
+      UNSPEC_ATOMIC_STORE))]
+  ""
+  "amswap%A2.<amo>\t$zero,%z1,%0"
+  [(set (attr "length") (const_int 8))])
+
+(define_insn "atomic_<atomic_optab><mode>"
+  [(set (match_operand:GPR 0 "memory_operand" "+ZB")
+	(unspec_volatile:GPR
+	  [(any_atomic:GPR (match_dup 0)
+			   (match_operand:GPR 1 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 2 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+  "am<amop>%A2.<amo>\t$zero,%z1,%0"
+  [(set (attr "length") (const_int 8))])
+
+(define_insn "atomic_fetch_<atomic_optab><mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(match_operand:GPR 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR
+	  [(any_atomic:GPR (match_dup 1)
+		     (match_operand:GPR 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+  "am<amop>%A3.<amo>\t%0,%z2,%1"
+  [(set (attr "length") (const_int 8))])
+
+(define_insn "atomic_exchange<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(unspec_volatile:GPR
+	  [(match_operand:GPR 1 "memory_operand" "+ZB")
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	  UNSPEC_SYNC_EXCHANGE))
+   (set (match_dup 1)
+	(match_operand:GPR 2 "register_operand" "r"))]
+  ""
+  "amswap%A3.<amo>\t%0,%z2,%1"
+  [(set (attr "length") (const_int 8))])
+
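+;; A standard LL/SC retry loop.  %Gn is assumed to emit a memory barrier
+;; appropriate for the memory-model operand n; "dbar 0x700" provides the
+;; barrier on the failure path.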
+(define_insn "atomic_cas_value_strong<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")
+			      (match_operand:SI 4 "const_int_operand")  ;; mod_s
+			      (match_operand:SI 5 "const_int_operand")] ;; mod_f
+	 UNSPEC_COMPARE_AND_SWAP))
+   (clobber (match_scratch:GPR 6 "=&r"))]
+  ""
+{
+  return "%G5\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "bne\\t%0,%z2,2f\\n\\t"
+	 "or%i3\\t%6,$zero,%3\\n\\t"
+	 "sc.<amo>\\t%6,%1\\n\\t"
+	 "beq\\t$zero,%6,1b\\n\\t"
+	 "b\\t3f\\n\\t"
+	 "2:\\n\\t"
+	 "dbar\\t0x700\\n\\t"
+	 "3:\\n\\t";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_expand "atomic_compare_and_swap<mode>"
+  [(match_operand:SI 0 "register_operand" "")   ;; bool output
+   (match_operand:GPR 1 "register_operand" "")  ;; val output
+   (match_operand:GPR 2 "memory_operand" "")    ;; memory
+   (match_operand:GPR 3 "reg_or_0_operand" "")  ;; expected value
+   (match_operand:GPR 4 "reg_or_0_operand" "")  ;; desired value
+   (match_operand:SI 5 "const_int_operand" "")  ;; is_weak
+   (match_operand:SI 6 "const_int_operand" "")  ;; mod_s
+   (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
+  ""
+{
+  emit_insn (gen_atomic_cas_value_strong<mode> (operands[1], operands[2],
+						operands[3], operands[4],
+						operands[6], operands[7]));
+
+  rtx compare = operands[1];
+  if (operands[3] != const0_rtx)
+    {
+      rtx difference = gen_rtx_MINUS (<MODE>mode, operands[1], operands[3]);
+      compare = gen_reg_rtx (<MODE>mode);
+      emit_insn (gen_rtx_SET (compare, difference));
+    }
+
+  if (word_mode != <MODE>mode)
+    {
+      rtx reg = gen_reg_rtx (word_mode);
+      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
+      compare = reg;
+    }
+
+  emit_insn (gen_rtx_SET (operands[0],
+			  gen_rtx_EQ (SImode, compare, const0_rtx)));
+  DONE;
+})
+
+(define_expand "atomic_test_and_set"
+  [(match_operand:QI 0 "register_operand" "")     ;; bool output
+   (match_operand:QI 1 "memory_operand" "+ZB")    ;; memory
+   (match_operand:SI 2 "const_int_operand" "")]   ;; model
+  ""
+{
+  /* We have no QImode atomics, so use the address LSBs to form a mask,
+     then use an aligned SImode atomic.  */
+  rtx result = operands[0];
+  rtx mem = operands[1];
+  rtx model = operands[2];
+  rtx addr = force_reg (Pmode, XEXP (mem, 0));
+  rtx tmp_reg = gen_reg_rtx (Pmode);
+  rtx zero_reg = gen_rtx_REG (Pmode, 0);
+
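+  /* Align the address down to the containing SImode word (addr & -4).  */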
+  rtx aligned_addr = gen_reg_rtx (Pmode);
+  emit_move_insn (tmp_reg, gen_rtx_PLUS (Pmode, zero_reg, GEN_INT (-4)));
+  emit_move_insn (aligned_addr, gen_rtx_AND (Pmode, addr, tmp_reg));
+
+  rtx aligned_mem = change_address (mem, SImode, aligned_addr);
+  set_mem_alias_set (aligned_mem, 0);
+
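+  /* Byte offset of the flag within the word; scaled to a bit shift below.  */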
+  rtx offset = gen_reg_rtx (SImode);
+  emit_move_insn (offset, gen_rtx_AND (SImode, gen_lowpart (SImode, addr),
+				       GEN_INT (3)));
+
+  rtx tmp = gen_reg_rtx (SImode);
+  emit_move_insn (tmp, GEN_INT (1));
+
+  rtx shmt = gen_reg_rtx (SImode);
+  emit_move_insn (shmt, gen_rtx_ASHIFT (SImode, offset, GEN_INT (3)));
+
+  rtx word = gen_reg_rtx (SImode);
+  emit_move_insn (word, gen_rtx_ASHIFT (SImode, tmp, shmt));
+
+  tmp = gen_reg_rtx (SImode);
+  emit_insn (gen_atomic_fetch_orsi (tmp, aligned_mem, word, model));
+
+  emit_move_insn (gen_lowpart (SImode, result),
+		  gen_rtx_LSHIFTRT (SImode, tmp, shmt));
+  DONE;
+})
+
+(define_insn "atomic_cas_value_cmp_and_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")
+			      (match_operand:SI 6 "const_int_operand")] ;; model
+	 UNSPEC_COMPARE_AND_SWAP))
+   (clobber (match_scratch:GPR 7 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%2\\n\\t"
+	 "bne\\t%7,%z4,2f\\n\\t"
+	 "and\\t%7,%0,%z3\\n\\t"
+	 "or%i5\\t%7,%7,%5\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b\\n\\t"
+	 "b\\t3f\\n\\t"
+	 "2:\\n\\t"
+	 "dbar\\t0x700\\n\\t"
+	 "3:\\n\\t";
+}
+  [(set (attr "length") (const_int 40))])
+
+(define_expand "atomic_compare_and_swap<mode>"
+  [(match_operand:SI 0 "register_operand" "")   ;; bool output
+   (match_operand:SHORT 1 "register_operand" "")  ;; val output
+   (match_operand:SHORT 2 "memory_operand" "")    ;; memory
+   (match_operand:SHORT 3 "reg_or_0_operand" "")  ;; expected value
+   (match_operand:SHORT 4 "reg_or_0_operand" "")  ;; desired value
+   (match_operand:SI 5 "const_int_operand" "")  ;; is_weak
+   (match_operand:SI 6 "const_int_operand" "")  ;; mod_s
+   (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_cmp_and_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[1], operands[2],
+				operands[3], operands[4], operands[7]);
+
+  rtx compare = operands[1];
+  if (operands[3] != const0_rtx)
+    {
+      machine_mode mode = GET_MODE (operands[3]);
+      rtx op1 = convert_modes (SImode, mode, operands[1], true);
+      rtx op3 = convert_modes (SImode, mode, operands[3], true);
+      rtx difference = gen_rtx_MINUS (SImode, op1, op3);
+      compare = gen_reg_rtx (SImode);
+      emit_insn (gen_rtx_SET (compare, difference));
+    }
+
+  if (word_mode != <MODE>mode)
+    {
+      rtx reg = gen_reg_rtx (word_mode);
+      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
+      compare = reg;
+    }
+
+  emit_insn (gen_rtx_SET (operands[0],
+			  gen_rtx_EQ (SImode, compare, const0_rtx)));
+  DONE;
+})
+
+(define_insn "atomic_cas_value_add_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_ADD))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "add.w\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_sub_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_SUB))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "sub.w\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_and_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_AND))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "and\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_xor_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_XOR))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "xor\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_or_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_OR))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "or\\t%8,%0,%z5\\n\\t"
+	 "and\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_insn "atomic_cas_value_nand_7_<mode>"
+  [(set (match_operand:GPR 0 "register_operand" "=&r")				;; res
+	(match_operand:GPR 1 "memory_operand" "+ZC"))
+   (set (match_dup 1)
+	(unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")	;; mask
+			      (match_operand:GPR 3 "reg_or_0_operand" "rJ")	;; inverted_mask
+			      (match_operand:GPR 4 "reg_or_0_operand"  "rJ")	;; old val
+			      (match_operand:GPR 5 "reg_or_0_operand"  "rJ")	;; new val
+			      (match_operand:SI 6 "const_int_operand")]		;; model
+	 UNSPEC_COMPARE_AND_SWAP_NAND))
+   (clobber (match_scratch:GPR 7 "=&r"))
+   (clobber (match_scratch:GPR 8 "=&r"))]
+  ""
+{
+  return "%G6\\n\\t"
+	 "1:\\n\\t"
+	 "ll.<amo>\\t%0,%1\\n\\t"
+	 "and\\t%7,%0,%3\\n\\t"
+	 "and\\t%8,%0,%z5\\n\\t"
+	 "xor\\t%8,%8,%z2\\n\\t"
+	 "or%i8\\t%7,%7,%8\\n\\t"
+	 "sc.<amo>\\t%7,%1\\n\\t"
+	 "beq\\t$zero,%7,1b";
+}
+  [(set (attr "length") (const_int 32))])
+
+(define_expand "atomic_exchange<mode>"
+  [(set (match_operand:SHORT 0 "register_operand")
+	(unspec_volatile:SHORT
+	  [(match_operand:SHORT 1 "memory_operand")
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	  UNSPEC_SYNC_EXCHANGE))
+   (set (match_dup 1)
+	(match_operand:SHORT 2 "register_operand"))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_cmp_and_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_add<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(plus:SHORT (match_dup 1)
+		       (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_add_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_sub<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(minus:SHORT (match_dup 1)
+			(match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_sub_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_and<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(and:SHORT (match_dup 1)
+		      (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_and_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_xor<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(xor:SHORT (match_dup 1)
+		      (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_xor_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_or<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(ior:SHORT (match_dup 1)
+		      (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_or_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
+
+(define_expand "atomic_fetch_nand<mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+	(match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(not:SHORT (and:SHORT (match_dup 1)
+				 (match_operand:SHORT 2 "reg_or_0_operand" "rJ")))
+	   (match_operand:SI 3 "const_int_operand")] ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  ""
+{
+  union loongarch_gen_fn_ptrs generator;
+  generator.fn_7 = gen_atomic_cas_value_nand_7_si;
+  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				operands[1], operands[2], operands[3]);
+  DONE;
+})
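+
+;; For illustration, a subword atomic operation such as
+;;
+;;   short v;
+;;   short old = __atomic_fetch_add (&v, 1, __ATOMIC_SEQ_CST);
+;;
+;; is expected to expand through atomic_cas_value_add_7_si above: the
+;; halfword is updated inside an aligned SImode LL.W/SC.W loop, with the
+;; mask/inverted_mask operands selecting the halfword within the word
+;; (a sketch; the exact register allocation is up to the expander).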
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 05/12] LoongArch Port: Machine description C files and .h files.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (3 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 04/12] LoongArch Port: Machine description files xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-07 18:17   ` Richard Sandiford
  2022-03-04  7:18 ` [PATCH v8 06/12] LoongArch Port: Builtin functions xuchenghua
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/
	* config/host-linux.cc: Add LoongArch support.
	* config/loongarch/loongarch-protos.h: New file.
	* config/loongarch/loongarch-tune.h: Likewise.
	* config/loongarch/loongarch.cc: Likewise.
	* config/loongarch/loongarch.h: Likewise.
---
 gcc/config/host-linux.cc                |    2 +
 gcc/config/loongarch/loongarch-protos.h |  240 +
 gcc/config/loongarch/loongarch-tune.h   |   72 +
 gcc/config/loongarch/loongarch.cc       | 6450 +++++++++++++++++++++++
 gcc/config/loongarch/loongarch.h        | 1273 +++++
 5 files changed, 8037 insertions(+)
 create mode 100644 gcc/config/loongarch/loongarch-protos.h
 create mode 100644 gcc/config/loongarch/loongarch-tune.h
 create mode 100644 gcc/config/loongarch/loongarch.cc
 create mode 100644 gcc/config/loongarch/loongarch.h

diff --git a/gcc/config/host-linux.cc b/gcc/config/host-linux.cc
index ff4557d5f8d..817d3c0870c 100644
--- a/gcc/config/host-linux.cc
+++ b/gcc/config/host-linux.cc
@@ -98,6 +98,8 @@
 # define TRY_EMPTY_VM_SPACE	0x60000000
 #elif defined(__riscv) && defined (__LP64__)
 # define TRY_EMPTY_VM_SPACE	0x1000000000
+#elif defined(__loongarch__) && defined(__LP64__)
+# define TRY_EMPTY_VM_SPACE	0x8000000000
 #else
 # define TRY_EMPTY_VM_SPACE	0
 #endif
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
new file mode 100644
index 00000000000..39a06b87294
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -0,0 +1,240 @@
+/* Prototypes of target machine for GNU compiler.  LoongArch version.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_LOONGARCH_PROTOS_H
+#define GCC_LOONGARCH_PROTOS_H
+
+/* Describes how a symbol is used.
+
+   SYMBOL_CONTEXT_CALL
+       The symbol is used as the target of a call instruction.
+
+   SYMBOL_CONTEXT_LEA
+       The symbol is used in a load-address operation.
+
+   SYMBOL_CONTEXT_MEM
+       The symbol is used as the address in a MEM.  */
+enum loongarch_symbol_context {
+  SYMBOL_CONTEXT_CALL,
+  SYMBOL_CONTEXT_LEA,
+  SYMBOL_CONTEXT_MEM
+};
+
+/* Classifies a SYMBOL_REF, LABEL_REF or UNSPEC address.
+
+   SYMBOL_GOT_DISP
+       The symbol's value will be loaded directly from the GOT.
+
+   SYMBOL_TLS
+       A thread-local symbol.
+
+   SYMBOL_TLSGD
+   SYMBOL_TLSLDM
+       UNSPEC wrappers around SYMBOL_TLS, corresponding to the
+       thread-local storage relocation operators.
+   */
+enum loongarch_symbol_type {
+  SYMBOL_GOT_DISP,
+  SYMBOL_TLS,
+  SYMBOL_TLSGD,
+  SYMBOL_TLSLDM,
+};
+#define NUM_SYMBOL_TYPES (SYMBOL_TLSLDM + 1)
+
+/* Classifies a type of call.
+
+   LARCH_CALL_NORMAL
+	A normal call or call_value pattern.
+
+   LARCH_CALL_SIBCALL
+	A sibcall or sibcall_value pattern.
+
+   LARCH_CALL_EPILOGUE
+	A call inserted in the epilogue.  */
+enum loongarch_call_type {
+  LARCH_CALL_NORMAL,
+  LARCH_CALL_SIBCALL,
+  LARCH_CALL_EPILOGUE
+};
+
+/* Controls the conditions under which certain instructions are split.
+
+   SPLIT_IF_NECESSARY
+	Only perform splits that are necessary for correctness
+	(because no unsplit version exists).
+
+   SPLIT_FOR_SPEED
+	Perform splits that are necessary for correctness or
+	beneficial for code speed.
+
+   SPLIT_FOR_SIZE
+	Perform splits that are necessary for correctness or
+	beneficial for code size.  */
+enum loongarch_split_type {
+  SPLIT_IF_NECESSARY,
+  SPLIT_FOR_SPEED,
+  SPLIT_FOR_SIZE
+};
+
+extern const char *const loongarch_fp_conditions[16];
+
+/* Routines implemented in loongarch.cc.  */
+extern rtx loongarch_emit_move (rtx, rtx);
+extern HOST_WIDE_INT loongarch_initial_elimination_offset (int, int);
+extern void loongarch_expand_prologue (void);
+extern void loongarch_expand_epilogue (bool);
+extern bool loongarch_can_use_return_insn (void);
+\f
+extern bool loongarch_symbolic_constant_p (rtx, enum loongarch_symbol_context,
+					   enum loongarch_symbol_type *);
+extern int loongarch_regno_mode_ok_for_base_p (int, machine_mode, bool);
+extern int loongarch_address_insns (rtx, machine_mode, bool);
+extern int loongarch_const_insns (rtx);
+extern int loongarch_split_const_insns (rtx);
+extern int loongarch_split_128bit_const_insns (rtx);
+extern int loongarch_load_store_insns (rtx, rtx_insn *);
+extern int loongarch_idiv_insns (machine_mode);
+#ifdef RTX_CODE
+extern void loongarch_emit_binary (enum rtx_code, rtx, rtx, rtx);
+#endif
+extern bool loongarch_split_symbol (rtx, rtx, machine_mode, rtx *);
+extern rtx loongarch_unspec_address (rtx, enum loongarch_symbol_type);
+extern rtx loongarch_strip_unspec_address (rtx);
+extern void loongarch_move_integer (rtx, rtx, unsigned HOST_WIDE_INT);
+extern bool loongarch_legitimize_move (machine_mode, rtx, rtx);
+extern rtx loongarch_legitimize_call_address (rtx);
+
+extern rtx loongarch_subword (rtx, bool);
+extern bool loongarch_split_move_p (rtx, rtx, enum loongarch_split_type);
+extern void loongarch_split_move (rtx, rtx, enum loongarch_split_type, rtx);
+extern bool loongarch_split_move_insn_p (rtx, rtx, rtx);
+extern void loongarch_split_move_insn (rtx, rtx, rtx);
+extern const char *loongarch_output_move (rtx, rtx);
+extern bool loongarch_cfun_has_cprestore_slot_p (void);
+#ifdef RTX_CODE
+extern void loongarch_expand_scc (rtx *);
+extern void loongarch_expand_conditional_branch (rtx *);
+extern void loongarch_expand_conditional_move (rtx *);
+extern void loongarch_expand_conditional_trap (rtx);
+#endif
+extern void loongarch_set_return_address (rtx, rtx);
+extern bool loongarch_move_by_pieces_p (unsigned HOST_WIDE_INT, unsigned int);
+extern bool loongarch_expand_block_move (rtx, rtx, rtx);
+
+extern bool loongarch_expand_ext_as_unaligned_load (rtx, rtx, HOST_WIDE_INT,
+						    HOST_WIDE_INT, bool);
+extern bool loongarch_expand_ins_as_unaligned_store (rtx, rtx, HOST_WIDE_INT,
+						     HOST_WIDE_INT);
+extern HOST_WIDE_INT loongarch_debugger_offset (rtx, HOST_WIDE_INT);
+
+extern void loongarch_output_external (FILE *, tree, const char *);
+extern void loongarch_output_ascii (FILE *, const char *, size_t);
+extern void loongarch_output_aligned_decl_common (FILE *, tree, const char *,
+						  unsigned HOST_WIDE_INT,
+						  unsigned int);
+extern void loongarch_declare_common_object (FILE *, const char *,
+					     const char *,
+					     unsigned HOST_WIDE_INT,
+					     unsigned int, bool);
+extern void loongarch_declare_object (FILE *, const char *, const char *,
+				      const char *, ...) ATTRIBUTE_PRINTF_4;
+extern void loongarch_declare_object_name (FILE *, const char *, tree);
+extern void loongarch_finish_declare_object (FILE *, tree, int, int);
+extern void loongarch_set_text_contents_type (FILE *, const char *,
+					      unsigned long, bool);
+
+extern bool loongarch_small_data_pattern_p (rtx);
+extern rtx loongarch_rewrite_small_data (rtx);
+extern rtx loongarch_return_addr (int, rtx);
+
+extern enum reg_class loongarch_secondary_reload_class (enum reg_class,
+							machine_mode,
+							rtx, bool);
+extern int loongarch_class_max_nregs (enum reg_class, machine_mode);
+
+extern machine_mode loongarch_hard_regno_caller_save_mode (unsigned int,
+							   unsigned int,
+							   machine_mode);
+extern int loongarch_adjust_insn_length (rtx_insn *, int);
+extern const char *loongarch_output_conditional_branch (rtx_insn *, rtx *,
+							const char *,
+							const char *);
+extern const char *loongarch_output_order_conditional_branch (rtx_insn *,
+							      rtx *,
+							      bool);
+extern const char *loongarch_output_equal_conditional_branch (rtx_insn *,
+							      rtx *,
+							      bool);
+extern const char *loongarch_output_division (const char *, rtx *);
+extern const char *loongarch_output_probe_stack_range (rtx, rtx, rtx);
+extern bool loongarch_hard_regno_rename_ok (unsigned int, unsigned int);
+extern int loongarch_dspalu_bypass_p (rtx, rtx);
+extern rtx loongarch_prefetch_cookie (rtx, rtx);
+
+extern bool loongarch_global_symbol_p (const_rtx);
+extern bool loongarch_global_symbol_noweak_p (const_rtx);
+extern bool loongarch_weak_symbol_p (const_rtx);
+extern bool loongarch_symbol_binds_local_p (const_rtx);
+
+extern const char *current_section_name (void);
+extern unsigned int current_section_flags (void);
+extern bool loongarch_use_ins_ext_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+
+union loongarch_gen_fn_ptrs
+{
+  rtx (*fn_8) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  rtx (*fn_7) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  rtx (*fn_6) (rtx, rtx, rtx, rtx, rtx, rtx);
+  rtx (*fn_5) (rtx, rtx, rtx, rtx, rtx);
+  rtx (*fn_4) (rtx, rtx, rtx, rtx);
+};
+
+extern void loongarch_expand_atomic_qihi (union loongarch_gen_fn_ptrs,
+					  rtx, rtx, rtx, rtx, rtx);
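+
+/* For illustration, a typical use from a machine-description expander
+   (mirroring the sync patterns in this series) is:
+
+     union loongarch_gen_fn_ptrs generator;
+     generator.fn_7 = gen_atomic_cas_value_add_7_si;
+     loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+				   operands[1], operands[2], operands[3]);
+
+   The union lets a single expander accept generator functions of
+   different arities.  */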
+
+extern bool loongarch_signed_immediate_p (unsigned HOST_WIDE_INT, int, int);
+extern bool loongarch_unsigned_immediate_p (unsigned HOST_WIDE_INT, int, int);
+extern bool loongarch_12bit_offset_address_p (rtx, machine_mode);
+extern bool loongarch_14bit_shifted_offset_address_p (rtx, machine_mode);
+extern rtx loongarch_expand_thread_pointer (rtx);
+
+extern bool loongarch_eh_uses (unsigned int);
+extern bool loongarch_epilogue_uses (unsigned int);
+extern bool loongarch_load_store_bonding_p (rtx *, machine_mode, bool);
+extern bool loongarch_split_symbol_type (enum loongarch_symbol_type);
+
+typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
+
+extern void loongarch_register_frame_header_opt (void);
+
+extern void loongarch_declare_function_name (FILE *, const char *, tree);
+
+/* Routines implemented in loongarch-c.cc.  */
+void loongarch_cpu_cpp_builtins (cpp_reader *);
+
+extern void loongarch_init_builtins (void);
+extern void loongarch_atomic_assign_expand_fenv (tree *, tree *, tree *);
+extern tree loongarch_builtin_decl (unsigned int, bool);
+extern rtx loongarch_expand_builtin (tree, rtx, rtx subtarget ATTRIBUTE_UNUSED,
+				     machine_mode, int);
+extern tree loongarch_build_builtin_va_list (void);
+
+#endif /* ! GCC_LOONGARCH_PROTOS_H */
diff --git a/gcc/config/loongarch/loongarch-tune.h b/gcc/config/loongarch/loongarch-tune.h
new file mode 100644
index 00000000000..7067c2228bf
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-tune.h
@@ -0,0 +1,72 @@
+/* Definitions for microarchitecture-related data structures.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef LOONGARCH_TUNE_H
+#define LOONGARCH_TUNE_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* RTX costs of various operations on the different architectures.  */
+struct loongarch_rtx_cost_data
+{
+  unsigned short fp_add;
+  unsigned short fp_mult_sf;
+  unsigned short fp_mult_df;
+  unsigned short fp_div_sf;
+  unsigned short fp_div_df;
+  unsigned short int_mult_si;
+  unsigned short int_mult_di;
+  unsigned short int_div_si;
+  unsigned short int_div_di;
+  unsigned short branch_cost;
+  unsigned short memory_latency;
+};
+
+/* Default RTX cost initializer.  */
+#define COSTS_N_INSNS(N) ((N) * 4)
+#define DEFAULT_COSTS				\
+    .fp_add		= COSTS_N_INSNS (1),	\
+    .fp_mult_sf		= COSTS_N_INSNS (2),	\
+    .fp_mult_df		= COSTS_N_INSNS (2),	\
+    .fp_div_sf		= COSTS_N_INSNS (4),	\
+    .fp_div_df		= COSTS_N_INSNS (4),	\
+    .int_mult_si	= COSTS_N_INSNS (1),	\
+    .int_mult_di	= COSTS_N_INSNS (1),	\
+    .int_div_si		= COSTS_N_INSNS (1),	\
+    .int_div_di		= COSTS_N_INSNS (1),	\
+    .branch_cost	= 2,			\
+    .memory_latency	= 4
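+
+/* For illustration, a tuning table can then be initialized as (a
+   sketch; "my_cpu_costs" is a hypothetical name):
+
+     static const struct loongarch_rtx_cost_data my_cpu_costs = {
+       DEFAULT_COSTS
+     };  */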
+
+/* Costs to use when optimizing for size.  */
+extern const struct loongarch_rtx_cost_data loongarch_rtx_cost_optimize_size;
+
+/* Cache size record of known processor models.  */
+struct loongarch_cache {
+    int l1d_line_size;  /* bytes */
+    int l1d_size;       /* KiB */
+    int l2d_size;       /* KiB */
+};
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* LOONGARCH_TUNE_H */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
new file mode 100644
index 00000000000..7cc69e3f1d2
--- /dev/null
+++ b/gcc/config/loongarch/loongarch.cc
@@ -0,0 +1,6450 @@
+/* Subroutines used for LoongArch code generation.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS and RISC-V target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "df.h"
+#include "tm_p.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "optabs.h"
+#include "regs.h"
+#include "emit-rtl.h"
+#include "recog.h"
+#include "cgraph.h"
+#include "diagnostic.h"
+#include "insn-attr.h"
+#include "output.h"
+#include "alias.h"
+#include "fold-const.h"
+#include "varasm.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "explow.h"
+#include "expr.h"
+#include "libfuncs.h"
+#include "reload.h"
+#include "common/common-target.h"
+#include "langhooks.h"
+#include "cfgrtl.h"
+#include "cfganal.h"
+#include "sched-int.h"
+#include "gimplify.h"
+#include "target-globals.h"
+#include "tree-pass.h"
+#include "context.h"
+#include "builtins.h"
+#include "rtl-iter.h"
+
+/* This file should be included last.  */
+#include "target-def.h"
+
+/* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF.  */
+#define UNSPEC_ADDRESS_P(X)					\
+  (GET_CODE (X) == UNSPEC					\
+   && XINT (X, 1) >= UNSPEC_ADDRESS_FIRST			\
+   && XINT (X, 1) < UNSPEC_ADDRESS_FIRST + NUM_SYMBOL_TYPES)
+
+/* Extract the symbol or label from UNSPEC wrapper X.  */
+#define UNSPEC_ADDRESS(X) XVECEXP (X, 0, 0)
+
+/* Extract the symbol type from UNSPEC wrapper X.  */
+#define UNSPEC_ADDRESS_TYPE(X) \
+  ((enum loongarch_symbol_type) (XINT (X, 1) - UNSPEC_ADDRESS_FIRST))
+
+/* True if INSN is a loongarch.md pattern or asm statement.  */
+/* ???	This test is used throughout the compiler, so perhaps it should
+   be moved to rtl.h.  */
+#define USEFUL_INSN_P(INSN)						\
+  (NONDEBUG_INSN_P (INSN)						\
+   && GET_CODE (PATTERN (INSN)) != USE					\
+   && GET_CODE (PATTERN (INSN)) != CLOBBER)
+
+/* True if bit BIT is set in VALUE.  */
+#define BITSET_P(VALUE, BIT) (((VALUE) & (1 << (BIT))) != 0)
+
+/* Classifies an address.
+
+   ADDRESS_REG
+       A natural register + offset address.  The register satisfies
+       loongarch_valid_base_register_p and the offset is a const_arith_operand.
+
+   ADDRESS_REG_REG
+       A base register indexed by (optionally scaled) register.
+
+   ADDRESS_CONST_INT
+       A signed 16-bit constant address.
+
+   ADDRESS_SYMBOLIC
+       A constant symbolic address.  */
+enum loongarch_address_type
+{
+  ADDRESS_REG,
+  ADDRESS_REG_REG,
+  ADDRESS_CONST_INT,
+  ADDRESS_SYMBOLIC
+};
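+
+/* For illustration (LA64 assembly; a sketch, registers arbitrary):
+
+     ADDRESS_REG	 ld.d  $r4,$r5,8      base + 12-bit offset
+     ADDRESS_REG_REG	 ldx.d $r4,$r5,$r6    base + index register
+     ADDRESS_CONST_INT	 a small integer constant used as the address
+     ADDRESS_SYMBOLIC	 the address of a symbol or label.  */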
+
+/* Information about an address described by loongarch_address_type.
+
+   ADDRESS_CONST_INT
+       No fields are used.
+
+   ADDRESS_REG
+       REG is the base register and OFFSET is the constant offset.
+
+   ADDRESS_SYMBOLIC
+       SYMBOL_TYPE is the type of symbol that the address references.  */
+struct loongarch_address_info
+{
+  enum loongarch_address_type type;
+  rtx reg;
+  rtx offset;
+  enum loongarch_symbol_type symbol_type;
+};
+
+/* Methods of loading immediate values:
+
+   METHOD_NORMAL:
+     Load bits 0-31 of the immediate.
+
+   METHOD_LU32I:
+     Load bits 32-51 of the immediate.
+
+   METHOD_LU52I:
+     Load bits 52-63 of the immediate.
+
+   METHOD_INSV:
+     An immediate of the form 0xfff00000fffffxxx.  */
+enum loongarch_load_imm_method
+{
+  METHOD_NORMAL,
+  METHOD_LU32I,
+  METHOD_LU52I,
+  METHOD_INSV
+};
+
+struct loongarch_integer_op
+{
+  enum rtx_code code;
+  unsigned HOST_WIDE_INT value;
+  enum loongarch_load_imm_method method;
+};
+
+/* The largest number of operations needed to load an integer constant.
+   The worst accepted case for 64-bit constants is LU12I.W,LU32I.D,LU52I.D,ORI
+   or LU12I.W,LU32I.D,LU52I.D,ADDI.D.  */
+#define LARCH_MAX_INTEGER_OPS 4
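+
+/* For illustration, loading the 64-bit constant 0x1234567890abcdef is
+   expected to need all four operations (a sketch, assuming the usual
+   bit ranges of the instructions named above):
+
+     lu12i.w  $r12,0x90abc	   # bits 12-31
+     ori      $r12,$r12,0xdef	   # bits 0-11
+     lu32i.d  $r12,0x45678	   # bits 32-51
+     lu52i.d  $r12,$r12,0x123	   # bits 52-63  */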
+
+/* Array that maps GCC register numbers to debugger register numbers.  */
+int loongarch_dwarf_regno[FIRST_PSEUDO_REGISTER];
+
+/* Index [M][R] is true if register R is allowed to hold a value of mode M.  */
+static bool loongarch_hard_regno_mode_ok_p[MAX_MACHINE_MODE]
+					  [FIRST_PSEUDO_REGISTER];
+
+/* Index C is true if character C is a valid PRINT_OPERAND punctuation
+   character.  */
+static bool loongarch_print_operand_punct[256];
+
+/* Cached value of can_issue_more.  This is cached in loongarch_variable_issue
+   hook and returned from loongarch_sched_reorder2.  */
+static int cached_can_issue_more;
+
+/* Index R is the smallest register class that contains register R.  */
+const enum reg_class loongarch_regno_to_class[FIRST_PSEUDO_REGISTER] = {
+    GR_REGS,	     GR_REGS,	      GR_REGS,	       GR_REGS,
+    JIRL_REGS,       JIRL_REGS,       JIRL_REGS,       JIRL_REGS,
+    JIRL_REGS,       JIRL_REGS,       JIRL_REGS,       JIRL_REGS,
+    SIBCALL_REGS,    SIBCALL_REGS,    SIBCALL_REGS,    SIBCALL_REGS,
+    SIBCALL_REGS,    SIBCALL_REGS,    SIBCALL_REGS,    SIBCALL_REGS,
+    SIBCALL_REGS,    GR_REGS,	      GR_REGS,	       JIRL_REGS,
+    JIRL_REGS,       JIRL_REGS,       JIRL_REGS,       JIRL_REGS,
+    JIRL_REGS,       JIRL_REGS,       JIRL_REGS,       JIRL_REGS,
+
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FP_REGS,	FP_REGS,	FP_REGS,	FP_REGS,
+    FCC_REGS,	FCC_REGS,	FCC_REGS,	FCC_REGS,
+    FCC_REGS,	FCC_REGS,	FCC_REGS,	FCC_REGS,
+    FRAME_REGS,	FRAME_REGS
+};
+
+/* The value of TARGET_ATTRIBUTE_TABLE.  */
+static const struct attribute_spec loongarch_attribute_table[] = {
+    /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
+       affects_type_identity, handler, exclude }  */
+      { NULL, 0, 0, false, false, false, false, NULL, NULL }
+};
+
+/* Which cost information to use.  */
+static const struct loongarch_rtx_cost_data *loongarch_cost;
+
+/* Information about a single argument.  */
+struct loongarch_arg_info
+{
+  /* True if the argument is at least partially passed on the stack.  */
+  bool stack_p;
+
+  /* The number of integer registers allocated to this argument.  */
+  unsigned int num_gprs;
+
+  /* The offset of the first register used, provided num_gprs is nonzero.
+     If passed entirely on the stack, the value is MAX_ARGS_IN_REGISTERS.  */
+  unsigned int gpr_offset;
+
+  /* The number of floating-point registers allocated to this argument.  */
+  unsigned int num_fprs;
+
+  /* The offset of the first register used, provided num_fprs is nonzero.  */
+  unsigned int fpr_offset;
+};
+
+/* Implement TARGET_FUNCTION_ARG_BOUNDARY.  Every parameter gets at
+   least PARM_BOUNDARY bits of alignment, but will be given anything up
+   to PREFERRED_STACK_BOUNDARY bits if the type requires it.  */
+
+static unsigned int
+loongarch_function_arg_boundary (machine_mode mode, const_tree type)
+{
+  unsigned int alignment;
+
+  /* Use natural alignment if the type is not an aggregate.  */
+  if (type && !AGGREGATE_TYPE_P (type))
+    alignment = TYPE_ALIGN (TYPE_MAIN_VARIANT (type));
+  else
+    alignment = type ? TYPE_ALIGN (type) : GET_MODE_ALIGNMENT (mode);
+
+  return MIN (PREFERRED_STACK_BOUNDARY, MAX (PARM_BOUNDARY, alignment));
+}
+
+/* If MODE represents an argument that can be passed or returned in
+   floating-point registers, return the number of registers, else 0.  */
+
+static unsigned
+loongarch_pass_mode_in_fpr_p (machine_mode mode)
+{
+  if (GET_MODE_UNIT_SIZE (mode) <= UNITS_PER_FP_ARG)
+    {
+      if (GET_MODE_CLASS (mode) == MODE_FLOAT)
+	return 1;
+
+      if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT)
+	return 2;
+    }
+
+  return 0;
+}
+
+typedef struct
+{
+  const_tree type;
+  HOST_WIDE_INT offset;
+} loongarch_aggregate_field;
+
+/* Identify subfields of aggregates that are candidates for passing in
+   floating-point registers.  */
+
+static int
+loongarch_flatten_aggregate_field (const_tree type,
+				   loongarch_aggregate_field fields[2], int n,
+				   HOST_WIDE_INT offset)
+{
+  switch (TREE_CODE (type))
+    {
+    case RECORD_TYPE:
+      /* Can't handle incomplete types or sizes that are not fixed.  */
+      if (!COMPLETE_TYPE_P (type)
+	  || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST
+	  || !tree_fits_uhwi_p (TYPE_SIZE (type)))
+	return -1;
+
+      for (tree f = TYPE_FIELDS (type); f; f = DECL_CHAIN (f))
+	if (TREE_CODE (f) == FIELD_DECL)
+	  {
+	    if (!TYPE_P (TREE_TYPE (f)))
+	      return -1;
+
+	    HOST_WIDE_INT pos = offset + int_byte_position (f);
+	    n = loongarch_flatten_aggregate_field (TREE_TYPE (f), fields, n,
+						   pos);
+	    if (n < 0)
+	      return -1;
+	  }
+      return n;
+
+    case ARRAY_TYPE:
+      {
+	HOST_WIDE_INT n_elts;
+	loongarch_aggregate_field subfields[2];
+	tree index = TYPE_DOMAIN (type);
+	tree elt_size = TYPE_SIZE_UNIT (TREE_TYPE (type));
+	int n_subfields = loongarch_flatten_aggregate_field (TREE_TYPE (type),
+							     subfields, 0,
+							     offset);
+
+	/* Can't handle incomplete types or sizes that are not fixed.  */
+	if (n_subfields <= 0
+	    || !COMPLETE_TYPE_P (type)
+	    || TREE_CODE (TYPE_SIZE (type)) != INTEGER_CST
+	    || !index
+	    || !TYPE_MAX_VALUE (index)
+	    || !tree_fits_uhwi_p (TYPE_MAX_VALUE (index))
+	    || !TYPE_MIN_VALUE (index)
+	    || !tree_fits_uhwi_p (TYPE_MIN_VALUE (index))
+	    || !tree_fits_uhwi_p (elt_size))
+	  return -1;
+
+	n_elts = 1 + tree_to_uhwi (TYPE_MAX_VALUE (index))
+		 - tree_to_uhwi (TYPE_MIN_VALUE (index));
+	gcc_assert (n_elts >= 0);
+
+	for (HOST_WIDE_INT i = 0; i < n_elts; i++)
+	  for (int j = 0; j < n_subfields; j++)
+	    {
+	      if (n >= 2)
+		return -1;
+
+	      fields[n] = subfields[j];
+	      fields[n++].offset += i * tree_to_uhwi (elt_size);
+	    }
+
+	return n;
+      }
+
+    case COMPLEX_TYPE:
+      {
+	/* A complex type needs to consume two fields, so n must be 0.  */
+	if (n != 0)
+	  return -1;
+
+	HOST_WIDE_INT elt_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (type)));
+
+	if (elt_size <= UNITS_PER_FP_ARG)
+	  {
+	    fields[0].type = TREE_TYPE (type);
+	    fields[0].offset = offset;
+	    fields[1].type = TREE_TYPE (type);
+	    fields[1].offset = offset + elt_size;
+
+	    return 2;
+	  }
+
+	return -1;
+      }
+
+    default:
+      if (n < 2
+	  && ((SCALAR_FLOAT_TYPE_P (type)
+	       && GET_MODE_SIZE (TYPE_MODE (type)) <= UNITS_PER_FP_ARG)
+	      || (INTEGRAL_TYPE_P (type)
+		  && GET_MODE_SIZE (TYPE_MODE (type)) <= UNITS_PER_WORD)))
+	{
+	  fields[n].type = type;
+	  fields[n].offset = offset;
+	  return n + 1;
+	}
+      else
+	return -1;
+    }
+}
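+
+/* For illustration (LP64D, a sketch), flattening classifies:
+
+     struct { float x; float y; }    two FP fields
+     struct { float x[2]; }	     two FP fields
+     struct { float x; int y; }      one FP field and one integer field
+     struct { float a, b, c; }	     -1 (more than two fields)  */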
+
+/* Identify candidate aggregates for passing in floating-point registers.
+   Candidates have at most two fields after flattening.  */
+
+static int
+loongarch_flatten_aggregate_argument (const_tree type,
+				      loongarch_aggregate_field fields[2])
+{
+  if (!type || TREE_CODE (type) != RECORD_TYPE)
+    return -1;
+
+  return loongarch_flatten_aggregate_field (type, fields, 0, 0);
+}
+
+/* See whether TYPE is a record whose fields should be returned in one or
+   two floating-point registers.  If so, populate FIELDS accordingly.  */
+
+static unsigned
+loongarch_pass_aggregate_in_fpr_pair_p (const_tree type,
+					loongarch_aggregate_field fields[2])
+{
+  int n = loongarch_flatten_aggregate_argument (type, fields);
+
+  for (int i = 0; i < n; i++)
+    if (!SCALAR_FLOAT_TYPE_P (fields[i].type))
+      return 0;
+
+  return n > 0 ? n : 0;
+}
+
+/* See whether TYPE is a record whose fields should be returned in one
+   floating-point register and one integer register.  If so, populate
+   FIELDS accordingly.  */
+
+static bool
+loongarch_pass_aggregate_in_fpr_and_gpr_p (const_tree type,
+					   loongarch_aggregate_field fields[2])
+{
+  unsigned num_int = 0, num_float = 0;
+  int n = loongarch_flatten_aggregate_argument (type, fields);
+
+  for (int i = 0; i < n; i++)
+    {
+      num_float += SCALAR_FLOAT_TYPE_P (fields[i].type);
+      num_int += INTEGRAL_TYPE_P (fields[i].type);
+    }
+
+  return num_int == 1 && num_float == 1;
+}
+
+/* Return the representation of an argument passed or returned in an FPR
+   when the value has mode VALUE_MODE and the type has TYPE_MODE.  The
+   two modes may be different for structures like:
+
+   struct __attribute__((packed)) foo { float f; }
+
+   where the SFmode value "f" is passed in REGNO but the struct itself
+   has mode BLKmode.  */
+
+static rtx
+loongarch_pass_fpr_single (machine_mode type_mode, unsigned regno,
+			   machine_mode value_mode)
+{
+  rtx x = gen_rtx_REG (value_mode, regno);
+
+  if (type_mode != value_mode)
+    {
+      x = gen_rtx_EXPR_LIST (VOIDmode, x, const0_rtx);
+      x = gen_rtx_PARALLEL (type_mode, gen_rtvec (1, x));
+    }
+  return x;
+}
+
+/* Pass or return a composite value in the FPR pair REGNO and REGNO + 1.
+   MODE is the mode of the composite.  MODE1 and OFFSET1 are the mode and
+   byte offset for the first value, likewise MODE2 and OFFSET2 for the
+   second value.  */
+
+static rtx
+loongarch_pass_fpr_pair (machine_mode mode, unsigned regno1,
+			 machine_mode mode1, HOST_WIDE_INT offset1,
+			 unsigned regno2, machine_mode mode2,
+			 HOST_WIDE_INT offset2)
+{
+  return gen_rtx_PARALLEL (
+    mode, gen_rtvec (2,
+		     gen_rtx_EXPR_LIST (VOIDmode, gen_rtx_REG (mode1, regno1),
+					GEN_INT (offset1)),
+		     gen_rtx_EXPR_LIST (VOIDmode, gen_rtx_REG (mode2, regno2),
+					GEN_INT (offset2))));
+}
+
+/* Fill INFO with information about a single argument, and return an
+   RTL pattern to pass or return the argument.  CUM is the cumulative
+   state for earlier arguments.  MODE is the mode of this argument and
+   TYPE is its type (if known).  NAMED is true if this is a named
+   (fixed) argument rather than a variable one.  RETURN_P is true if
+   returning the argument, or false if passing the argument.  */
+
+static rtx
+loongarch_get_arg_info (struct loongarch_arg_info *info,
+			const CUMULATIVE_ARGS *cum, machine_mode mode,
+			const_tree type, bool named, bool return_p)
+{
+  unsigned num_bytes, num_words;
+  unsigned fpr_base = return_p ? FP_RETURN : FP_ARG_FIRST;
+  unsigned gpr_base = return_p ? GP_RETURN : GP_ARG_FIRST;
+  unsigned alignment = loongarch_function_arg_boundary (mode, type);
+
+  memset (info, 0, sizeof (*info));
+  info->gpr_offset = cum->num_gprs;
+  info->fpr_offset = cum->num_fprs;
+
+  if (named)
+    {
+      loongarch_aggregate_field fields[2];
+      unsigned fregno = fpr_base + info->fpr_offset;
+      unsigned gregno = gpr_base + info->gpr_offset;
+
+      /* Pass one- or two-element floating-point aggregates in FPRs.  */
+      if ((info->num_fprs
+	   = loongarch_pass_aggregate_in_fpr_pair_p (type, fields))
+	  && info->fpr_offset + info->num_fprs <= MAX_ARGS_IN_REGISTERS)
+	switch (info->num_fprs)
+	  {
+	  case 1:
+	    return loongarch_pass_fpr_single (mode, fregno,
+					      TYPE_MODE (fields[0].type));
+
+	  case 2:
+	    return loongarch_pass_fpr_pair (mode, fregno,
+					    TYPE_MODE (fields[0].type),
+					    fields[0].offset,
+					    fregno + 1,
+					    TYPE_MODE (fields[1].type),
+					    fields[1].offset);
+
+	  default:
+	    gcc_unreachable ();
+	  }
+
+      /* Pass real and complex floating-point numbers in FPRs.  */
+      if ((info->num_fprs = loongarch_pass_mode_in_fpr_p (mode))
+	  && info->fpr_offset + info->num_fprs <= MAX_ARGS_IN_REGISTERS)
+	switch (GET_MODE_CLASS (mode))
+	  {
+	  case MODE_FLOAT:
+	    return gen_rtx_REG (mode, fregno);
+
+	  case MODE_COMPLEX_FLOAT:
+	    return loongarch_pass_fpr_pair (mode, fregno,
+					    GET_MODE_INNER (mode), 0,
+					    fregno + 1, GET_MODE_INNER (mode),
+					    GET_MODE_UNIT_SIZE (mode));
+
+	  default:
+	    gcc_unreachable ();
+	  }
+
+      /* Pass structs with one float and one integer in an FPR and a GPR.  */
+      if (loongarch_pass_aggregate_in_fpr_and_gpr_p (type, fields)
+	  && info->gpr_offset < MAX_ARGS_IN_REGISTERS
+	  && info->fpr_offset < MAX_ARGS_IN_REGISTERS)
+	{
+	  info->num_gprs = 1;
+	  info->num_fprs = 1;
+
+	  if (!SCALAR_FLOAT_TYPE_P (fields[0].type))
+	    std::swap (fregno, gregno);
+
+	  return loongarch_pass_fpr_pair (mode, fregno,
+					  TYPE_MODE (fields[0].type),
+					  fields[0].offset, gregno,
+					  TYPE_MODE (fields[1].type),
+					  fields[1].offset);
+	}
+    }
+
+  /* Work out the size of the argument.  */
+  num_bytes = type ? int_size_in_bytes (type) : GET_MODE_SIZE (mode);
+  num_words = (num_bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+
+  /* Doubleword-aligned varargs start on an even register boundary.  */
+  if (!named && num_bytes != 0 && alignment > BITS_PER_WORD)
+    info->gpr_offset += info->gpr_offset & 1;
+
+  /* Partition the argument between registers and stack.  */
+  info->num_fprs = 0;
+  info->num_gprs = MIN (num_words, MAX_ARGS_IN_REGISTERS - info->gpr_offset);
+  info->stack_p = (num_words - info->num_gprs) != 0;
+
+  if (info->num_gprs || return_p)
+    return gen_rtx_REG (mode, gpr_base + info->gpr_offset);
+
+  return NULL_RTX;
+}
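+
+/* For illustration (LP64D, a sketch): for the first argument of
+
+     struct sf { float f; long l; };
+     void g (struct sf s);
+
+   the code above builds a PARALLEL passing s.f in $f0 and s.l in $r4,
+   assuming no earlier arguments have consumed those registers.  */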
+
+/* Implement TARGET_FUNCTION_ARG.  */
+
+static rtx
+loongarch_function_arg (cumulative_args_t cum_v, const function_arg_info &arg)
+{
+  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
+  struct loongarch_arg_info info;
+
+  if (arg.end_marker_p ())
+    return NULL;
+
+  return loongarch_get_arg_info (&info, cum, arg.mode, arg.type, arg.named,
+				 false);
+}
+
+/* Implement TARGET_FUNCTION_ARG_ADVANCE.  */
+
+static void
+loongarch_function_arg_advance (cumulative_args_t cum_v,
+				const function_arg_info &arg)
+{
+  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
+  struct loongarch_arg_info info;
+
+  loongarch_get_arg_info (&info, cum, arg.mode, arg.type, arg.named, false);
+
+  /* Advance the register count.  This has the effect of setting
+     num_gprs to MAX_ARGS_IN_REGISTERS if a doubleword-aligned
+     argument required us to skip the final GPR and pass the whole
+     argument on the stack.  */
+  cum->num_fprs = info.fpr_offset + info.num_fprs;
+  cum->num_gprs = info.gpr_offset + info.num_gprs;
+}
+
+/* Implement TARGET_ARG_PARTIAL_BYTES.  */
+
+static int
+loongarch_arg_partial_bytes (cumulative_args_t cum,
+			     const function_arg_info &generic_arg)
+{
+  struct loongarch_arg_info arg;
+
+  loongarch_get_arg_info (&arg, get_cumulative_args (cum), generic_arg.mode,
+			  generic_arg.type, generic_arg.named, false);
+  return arg.stack_p ? arg.num_gprs * UNITS_PER_WORD : 0;
+}
+
+/* Implement FUNCTION_VALUE and LIBCALL_VALUE.  For normal calls,
+   VALTYPE is the return type and MODE is VOIDmode.  For libcalls,
+   VALTYPE is null and MODE is the mode of the return value.  */
+
+static rtx
+loongarch_function_value_1 (const_tree type, const_tree func,
+			    machine_mode mode)
+{
+  struct loongarch_arg_info info;
+  CUMULATIVE_ARGS args;
+
+  if (type)
+    {
+      int unsigned_p = TYPE_UNSIGNED (type);
+
+      mode = TYPE_MODE (type);
+
+      /* Since TARGET_PROMOTE_FUNCTION_MODE unconditionally promotes
+	 return values, promote the mode here too.  */
+      mode = promote_function_mode (type, mode, &unsigned_p, func, 1);
+    }
+
+  memset (&args, 0, sizeof (args));
+  return loongarch_get_arg_info (&info, &args, mode, type, true, true);
+}
+
+/* Implement TARGET_FUNCTION_VALUE.  */
+
+static rtx
+loongarch_function_value (const_tree valtype, const_tree fn_decl_or_type,
+			  bool outgoing ATTRIBUTE_UNUSED)
+{
+  return loongarch_function_value_1 (valtype, fn_decl_or_type, VOIDmode);
+}
+
+/* Implement TARGET_LIBCALL_VALUE.  */
+
+static rtx
+loongarch_libcall_value (machine_mode mode, const_rtx fun ATTRIBUTE_UNUSED)
+{
+  return loongarch_function_value_1 (NULL_TREE, NULL_TREE, mode);
+}
+
+/* Implement TARGET_PASS_BY_REFERENCE.  */
+
+static bool
+loongarch_pass_by_reference (cumulative_args_t cum_v,
+			     const function_arg_info &arg)
+{
+  HOST_WIDE_INT size = arg.type_size_in_bytes ();
+  struct loongarch_arg_info info;
+  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
+
+  /* ??? std_gimplify_va_arg_expr passes NULL for cum.  Fortunately, we
+     never pass variadic arguments in floating-point registers, so we can
+     avoid the call to loongarch_get_arg_info in this case.  */
+  if (cum != NULL)
+    {
+      /* Don't pass by reference if we can use a floating-point register.  */
+      loongarch_get_arg_info (&info, cum, arg.mode, arg.type, arg.named,
+			      false);
+      if (info.num_fprs)
+	return false;
+    }
+
+  /* Pass by reference if the data do not fit in two integer registers.  */
+  return !IN_RANGE (size, 0, 2 * UNITS_PER_WORD);
+}
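+
+/* For illustration: with 64-bit GPRs, a 24-byte aggregate such as
+
+     struct big { long a, b, c; };
+
+   fails IN_RANGE (24, 0, 16) above and is passed by reference, while a
+   16-byte aggregate still travels in two integer registers.  */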
+
+/* Implement TARGET_RETURN_IN_MEMORY.  */
+
+static bool
+loongarch_return_in_memory (const_tree type,
+			    const_tree fndecl ATTRIBUTE_UNUSED)
+{
+  CUMULATIVE_ARGS args;
+  cumulative_args_t cum = pack_cumulative_args (&args);
+
+  /* The rules for returning in memory are the same as for passing the
+     first named argument by reference.  */
+  memset (&args, 0, sizeof (args));
+  function_arg_info arg (const_cast<tree> (type), /*named=*/true);
+  return loongarch_pass_by_reference (cum, arg);
+}
+
+/* Implement TARGET_SETUP_INCOMING_VARARGS.  */
+
+static void
+loongarch_setup_incoming_varargs (cumulative_args_t cum,
+				  const function_arg_info &arg,
+				  int *pretend_size ATTRIBUTE_UNUSED,
+				  int no_rtl)
+{
+  CUMULATIVE_ARGS local_cum;
+  int gp_saved;
+
+  /* The caller has advanced CUM up to, but not beyond, the last named
+     argument.  Advance a local copy of CUM past the last "real" named
+     argument, to find out how many registers are left over.  */
+  local_cum = *get_cumulative_args (cum);
+  loongarch_function_arg_advance (pack_cumulative_args (&local_cum), arg);
+
+  /* Find out how many registers we need to save.  */
+  gp_saved = MAX_ARGS_IN_REGISTERS - local_cum.num_gprs;
+
+  if (!no_rtl && gp_saved > 0)
+    {
+      rtx ptr = plus_constant (Pmode, virtual_incoming_args_rtx,
+			       REG_PARM_STACK_SPACE (cfun->decl)
+				 - gp_saved * UNITS_PER_WORD);
+      rtx mem = gen_frame_mem (BLKmode, ptr);
+      set_mem_alias_set (mem, get_varargs_alias_set ());
+
+      move_block_from_reg (local_cum.num_gprs + GP_ARG_FIRST, mem, gp_saved);
+    }
+  if (REG_PARM_STACK_SPACE (cfun->decl) == 0)
+    cfun->machine->varargs_size = gp_saved * UNITS_PER_WORD;
+}
+
+/* Make the last instruction frame-related and note that it performs
+   the operation described by FRAME_PATTERN.  */
+
+static void
+loongarch_set_frame_expr (rtx frame_pattern)
+{
+  rtx insn;
+
+  insn = get_last_insn ();
+  RTX_FRAME_RELATED_P (insn) = 1;
+  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR, frame_pattern,
+				      REG_NOTES (insn));
+}
+
+/* Return a frame-related rtx that stores REG at MEM.
+   REG must be a single register.  */
+
+static rtx
+loongarch_frame_set (rtx mem, rtx reg)
+{
+  rtx set = gen_rtx_SET (mem, reg);
+  RTX_FRAME_RELATED_P (set) = 1;
+  return set;
+}
+
+/* Return true if the current function must save register REGNO.  */
+
+static bool
+loongarch_save_reg_p (unsigned int regno)
+{
+  bool call_saved = !global_regs[regno] && !call_used_regs[regno];
+  bool might_clobber
+    = crtl->saves_all_registers || df_regs_ever_live_p (regno);
+
+  if (call_saved && might_clobber)
+    return true;
+
+  if (regno == HARD_FRAME_POINTER_REGNUM && frame_pointer_needed)
+    return true;
+
+  if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
+    return true;
+
+  return false;
+}
+
+/* Determine which GPR save/restore routine to call.  */
+
+static unsigned
+loongarch_save_libcall_count (unsigned mask)
+{
+  for (unsigned n = GP_REG_LAST; n > GP_REG_FIRST; n--)
+    if (BITSET_P (mask, n))
+      return CALLEE_SAVED_REG_NUMBER (n) + 1;
+  abort ();
+}
+
+/* Populate the current function's loongarch_frame_info structure.
+
+   LoongArch stack frames grow downward.  High addresses are at the top.
+
+     +-------------------------------+
+     |				     |
+     |  incoming stack arguments     |
+     |				     |
+     +-------------------------------+ <-- incoming stack pointer
+     |				     |
+     |  callee-allocated save area   |
+     |  for arguments that are       |
+     |  split between registers and  |
+     |  the stack		     |
+     |				     |
+     +-------------------------------+ <-- arg_pointer_rtx (virtual)
+     |				     |
+     |  callee-allocated save area   |
+     |  for register varargs	     |
+     |				     |
+     +-------------------------------+ <-- hard_frame_pointer_rtx;
+     |				     |     stack_pointer_rtx + gp_sp_offset
+     |  GPR save area		     |       + UNITS_PER_WORD
+     |				     |
+     +-------------------------------+ <-- stack_pointer_rtx + fp_sp_offset
+     |				     |       + UNITS_PER_HWVALUE
+     |  FPR save area		     |
+     |				     |
+     +-------------------------------+ <-- frame_pointer_rtx (virtual)
+     |				     |
+     |  local variables		     |
+     |				     |
+   P +-------------------------------+
+     |				     |
+     |  outgoing stack arguments     |
+     |				     |
+     +-------------------------------+ <-- stack_pointer_rtx
+
+   Dynamic stack allocations such as alloca insert data at point P.
+   They decrease stack_pointer_rtx but leave frame_pointer_rtx and
+   hard_frame_pointer_rtx unchanged.  */
+
+static void
+loongarch_compute_frame_info (void)
+{
+  struct loongarch_frame_info *frame;
+  HOST_WIDE_INT offset;
+  unsigned int regno, i, num_x_saved = 0, num_f_saved = 0;
+
+  frame = &cfun->machine->frame;
+  memset (frame, 0, sizeof (*frame));
+
+  /* Find out which GPRs we need to save.  */
+  for (regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (loongarch_save_reg_p (regno))
+      frame->mask |= 1 << (regno - GP_REG_FIRST), num_x_saved++;
+
+  /* If this function calls eh_return, we must also save and restore the
+     EH data registers.  */
+  if (crtl->calls_eh_return)
+    for (i = 0; (regno = EH_RETURN_DATA_REGNO (i)) != INVALID_REGNUM; i++)
+      frame->mask |= 1 << (regno - GP_REG_FIRST), num_x_saved++;
+
+  /* Find out which FPRs we need to save.  This loop must iterate over
+     the same space as its companion in loongarch_for_each_saved_reg.  */
+  if (TARGET_HARD_FLOAT)
+    for (regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+      if (loongarch_save_reg_p (regno))
+	frame->fmask |= 1 << (regno - FP_REG_FIRST), num_f_saved++;
+
+  /* At the bottom of the frame are any outgoing stack arguments.  */
+  offset = LARCH_STACK_ALIGN (crtl->outgoing_args_size);
+  /* Next are local stack variables.  */
+  offset += LARCH_STACK_ALIGN (get_frame_size ());
+  /* The virtual frame pointer points above the local variables.  */
+  frame->frame_pointer_offset = offset;
+  /* Next are the callee-saved FPRs.  */
+  if (frame->fmask)
+    offset += LARCH_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG);
+  frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
+  /* Next are the callee-saved GPRs.  */
+  if (frame->mask)
+    {
+      unsigned x_save_size = LARCH_STACK_ALIGN (num_x_saved * UNITS_PER_WORD);
+      unsigned num_save_restore
+	= 1 + loongarch_save_libcall_count (frame->mask);
+
+      /* Only use save/restore routines if they don't alter the stack size.  */
+      if (LARCH_STACK_ALIGN (num_save_restore * UNITS_PER_WORD) == x_save_size)
+	frame->save_libcall_adjustment = x_save_size;
+
+      offset += x_save_size;
+    }
+  frame->gp_sp_offset = offset - UNITS_PER_WORD;
+  /* The hard frame pointer points above the callee-saved GPRs.  */
+  frame->hard_frame_pointer_offset = offset;
+  /* Above the hard frame pointer is the callee-allocated varargs save
+     area.  */
+  offset += LARCH_STACK_ALIGN (cfun->machine->varargs_size);
+  /* Next is the callee-allocated area for pretend stack arguments.  */
+  offset += LARCH_STACK_ALIGN (crtl->args.pretend_args_size);
+  /* Arg pointer must be below pretend args, but must be above alignment
+     padding.  */
+  frame->arg_pointer_offset = offset - crtl->args.pretend_args_size;
+  frame->total_size = offset;
+  /* Above that are the incoming stack pointer and any incoming
+     arguments.  */
+
+  /* Only use save/restore routines when the GPRs are atop the frame.  */
+  if (frame->hard_frame_pointer_offset != frame->total_size)
+    frame->save_libcall_adjustment = 0;
+}
+
+/* Implement INITIAL_ELIMINATION_OFFSET.  FROM is either the frame pointer
+   or argument pointer.  TO is either the stack pointer or hard frame
+   pointer.  */
+
+HOST_WIDE_INT
+loongarch_initial_elimination_offset (int from, int to)
+{
+  HOST_WIDE_INT src, dest;
+
+  loongarch_compute_frame_info ();
+
+  if (to == HARD_FRAME_POINTER_REGNUM)
+    dest = cfun->machine->frame.hard_frame_pointer_offset;
+  else if (to == STACK_POINTER_REGNUM)
+    dest = 0; /* The stack pointer is the base of all offsets, hence 0.  */
+  else
+    gcc_unreachable ();
+
+  if (from == FRAME_POINTER_REGNUM)
+    src = cfun->machine->frame.frame_pointer_offset;
+  else if (from == ARG_POINTER_REGNUM)
+    src = cfun->machine->frame.arg_pointer_offset;
+  else
+    gcc_unreachable ();
+
+  return src - dest;
+}
+
+/* A function to save or store a register.  The first argument is the
+   register and the second is the stack slot.  */
+typedef void (*loongarch_save_restore_fn) (rtx, rtx);
+
+/* Use FN to save or restore register REGNO.  MODE is the register's
+   mode and OFFSET is the offset of its save slot from the current
+   stack pointer.  */
+
+static void
+loongarch_save_restore_reg (machine_mode mode, int regno, HOST_WIDE_INT offset,
+			    loongarch_save_restore_fn fn)
+{
+  rtx mem;
+
+  mem = gen_frame_mem (mode, plus_constant (Pmode, stack_pointer_rtx, offset));
+  fn (gen_rtx_REG (mode, regno), mem);
+}
+
+/* Call FN for each register that is saved by the current function.
+   SP_OFFSET is the offset of the current stack pointer from the start
+   of the frame.  */
+
+static void
+loongarch_for_each_saved_reg (HOST_WIDE_INT sp_offset,
+			      loongarch_save_restore_fn fn)
+{
+  HOST_WIDE_INT offset;
+
+  /* Save the link register and s-registers.  */
+  offset = cfun->machine->frame.gp_sp_offset - sp_offset;
+  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
+      {
+	loongarch_save_restore_reg (word_mode, regno, offset, fn);
+	offset -= UNITS_PER_WORD;
+      }
+
+  /* This loop must iterate over the same space as its companion in
+     loongarch_compute_frame_info.  */
+  offset = cfun->machine->frame.fp_sp_offset - sp_offset;
+  for (int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST))
+      {
+	machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode;
+
+	loongarch_save_restore_reg (mode, regno, offset, fn);
+	offset -= GET_MODE_SIZE (mode);
+      }
+}
+
+/* Emit a move from SRC to DEST.  Assume that the move expanders can
+   handle all moves if !can_create_pseudo_p ().  The distinction is
+   important because, unlike emit_move_insn, the move expanders know
+   how to force Pmode objects into the constant pool even when the
+   constant pool address is not itself legitimate.  */
+
+rtx
+loongarch_emit_move (rtx dest, rtx src)
+{
+  return (can_create_pseudo_p () ? emit_move_insn (dest, src)
+				 : emit_move_insn_1 (dest, src));
+}
+
+/* Save register REG to MEM.  Make the instruction frame-related.  */
+
+static void
+loongarch_save_reg (rtx reg, rtx mem)
+{
+  loongarch_emit_move (mem, reg);
+  loongarch_set_frame_expr (loongarch_frame_set (mem, reg));
+}
+
+/* Restore register REG from MEM.  */
+
+static void
+loongarch_restore_reg (rtx reg, rtx mem)
+{
+  rtx insn = loongarch_emit_move (reg, mem);
+  rtx dwarf = NULL_RTX;
+  dwarf = alloc_reg_note (REG_CFA_RESTORE, reg, dwarf);
+  REG_NOTES (insn) = dwarf;
+
+  RTX_FRAME_RELATED_P (insn) = 1;
+}
+
+/* For stack frames that can't be allocated with a single ADDI instruction,
+   compute the best value to initially allocate.  It must at a minimum
+   allocate enough space to spill the callee-saved registers.  */
+
+static HOST_WIDE_INT
+loongarch_first_stack_step (struct loongarch_frame_info *frame)
+{
+  if (IMM12_OPERAND (frame->total_size))
+    return frame->total_size;
+
+  HOST_WIDE_INT min_first_step
+    = LARCH_STACK_ALIGN (frame->total_size - frame->fp_sp_offset);
+  HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8;
+  HOST_WIDE_INT min_second_step = frame->total_size - max_first_step;
+  gcc_assert (min_first_step <= max_first_step);
+
+  /* As an optimization, use the least-significant bits of the total frame
+     size, so that the second adjustment step is just LU12I + ADD.  */
+  if (!IMM12_OPERAND (min_second_step)
+      && frame->total_size % IMM_REACH < IMM_REACH / 2
+      && frame->total_size % IMM_REACH >= min_first_step)
+    return frame->total_size % IMM_REACH;
+
+  return max_first_step;
+}
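+
+/* For illustration (a sketch, assuming IMM_REACH == 4096 and a 16-byte
+   preferred stack boundary): for total_size == 5000, 5000 % 4096 == 904
+   is below IMM_REACH / 2, so the first step allocates 904 bytes and the
+   remaining 4096 bytes can be formed with a single LU12I-class
+   instruction, provided the callee-saved registers fit in the first
+   step.  */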
+
+static void
+loongarch_emit_stack_tie (void)
+{
+  if (Pmode == SImode)
+    emit_insn (gen_stack_tiesi (stack_pointer_rtx, hard_frame_pointer_rtx));
+  else
+    emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));
+}
+
+#define PROBE_INTERVAL (1 << STACK_CHECK_PROBE_INTERVAL_EXP)
+
+#if PROBE_INTERVAL > 16384
+#error Cannot use indexed addressing mode for stack probing
+#endif
+
+/* Emit code to probe a range of stack addresses from FIRST to FIRST+SIZE,
+   inclusive.  These are offsets from the current stack pointer.  */
+
+static void
+loongarch_emit_probe_stack_range (HOST_WIDE_INT first, HOST_WIDE_INT size)
+{
+  /* See if we have a constant small number of probes to generate.  If so,
+     that's the easy case.  */
+  if ((TARGET_64BIT && (first + size <= 32768))
+      || (!TARGET_64BIT && (first + size <= 2048)))
+    {
+      HOST_WIDE_INT i;
+
+      /* Probe at FIRST + N * PROBE_INTERVAL for values of N from 1 until
+	 it exceeds SIZE.  If only one probe is needed, this will not
+	 generate any code.  Then probe at FIRST + SIZE.  */
+      for (i = PROBE_INTERVAL; i < size; i += PROBE_INTERVAL)
+	emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
+					 -(first + i)));
+
+      emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
+				       -(first + size)));
+    }
+
+  /* Otherwise, do the same as above, but in a loop.  Note that we must be
+     extra careful with variables wrapping around because we might be at
+     the very top (or the very bottom) of the address space and we have
+     to be able to handle this case properly; in particular, we use an
+     equality test for the loop condition.  */
+  else
+    {
+      HOST_WIDE_INT rounded_size;
+      rtx r13 = LARCH_PROLOGUE_TEMP (Pmode);
+      rtx r12 = LARCH_PROLOGUE_TEMP2 (Pmode);
+      rtx r14 = LARCH_PROLOGUE_TEMP3 (Pmode);
+
+      /* Sanity check for the addressing mode we're going to use.  */
+      gcc_assert (first <= 16384);
+
+      /* Step 1: round SIZE to the previous multiple of the interval.  */
+
+      rounded_size = ROUND_DOWN (size, PROBE_INTERVAL);
+
+      /* TEST_ADDR = SP + FIRST */
+      if (first != 0)
+	{
+	  emit_move_insn (r14, GEN_INT (first));
+	  emit_insn (gen_rtx_SET (r13, gen_rtx_MINUS (Pmode,
+						      stack_pointer_rtx,
+						      r14)));
+	}
+      else
+	emit_move_insn (r13, stack_pointer_rtx);
+
+      /* Step 2: compute initial and final value of the loop counter.  */
+
+      emit_move_insn (r14, GEN_INT (PROBE_INTERVAL));
+      /* LAST_ADDR = SP + FIRST + ROUNDED_SIZE.  */
+      if (rounded_size == 0)
+	emit_move_insn (r12, r13);
+      else
+	{
+	  emit_move_insn (r12, GEN_INT (rounded_size));
+	  emit_insn (gen_rtx_SET (r12, gen_rtx_MINUS (Pmode, r13, r12)));
+	  /* Step 3: the loop
+
+	     do
+	     {
+	     TEST_ADDR = TEST_ADDR + PROBE_INTERVAL
+	     probe at TEST_ADDR
+	     }
+	     while (TEST_ADDR != LAST_ADDR)
+
+	     probes at FIRST + N * PROBE_INTERVAL for values of N from 1
+	     until it is equal to ROUNDED_SIZE.  */
+
+	  emit_insn (PMODE_INSN (gen_probe_stack_range, (r13, r13, r12, r14)));
+	}
+
+      /* Step 4: probe at FIRST + SIZE if we cannot assert at compile-time
+	 that SIZE is equal to ROUNDED_SIZE.  */
+
+      if (size != rounded_size)
+	{
+	  if (TARGET_64BIT)
+	    emit_stack_probe (plus_constant (Pmode, r12, rounded_size - size));
+	  else
+	    {
+	      HOST_WIDE_INT i;
+	      for (i = 2048; i < (size - rounded_size); i += 2048)
+		{
+		  emit_stack_probe (plus_constant (Pmode, r12, -i));
+		  emit_insn (gen_rtx_SET (r12,
+					  plus_constant (Pmode, r12, -2048)));
+		}
+	      rtx r1 = plus_constant (Pmode, r12,
+				      -(size - rounded_size - i + 2048));
+	      emit_stack_probe (r1);
+	    }
+	}
+    }
+
+  /* Make sure nothing is scheduled before we are done.  */
+  emit_insn (gen_blockage ());
+}
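+
+/* Illustrative example: with FIRST == 12288 and SIZE == 8192 on a 64-bit
+   target, the simple case above emits stores of $r0 at SP - 16384 and
+   SP - 20480.  */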
+
+/* Probe a range of stack addresses from REG1 to REG2 inclusive.  These are
+   absolute addresses.  */
+const char *
+loongarch_output_probe_stack_range (rtx reg1, rtx reg2, rtx reg3)
+{
+  static int labelno = 0;
+  char loop_lab[32], tmp[64];
+  rtx xops[3];
+
+  ASM_GENERATE_INTERNAL_LABEL (loop_lab, "LPSRL", labelno++);
+
+  /* Loop.  */
+  ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, loop_lab);
+
+  /* TEST_ADDR = TEST_ADDR + PROBE_INTERVAL.  */
+  xops[0] = reg1;
+  xops[1] = GEN_INT (-PROBE_INTERVAL);
+  xops[2] = reg3;
+  if (TARGET_64BIT)
+    output_asm_insn ("sub.d\t%0,%0,%2", xops);
+  else
+    output_asm_insn ("sub.w\t%0,%0,%2", xops);
+
+  /* Probe at TEST_ADDR, test if TEST_ADDR == LAST_ADDR and branch.  */
+  xops[1] = reg2;
+  strcpy (tmp, "bne\t%0,%1,");
+  if (TARGET_64BIT)
+    output_asm_insn ("st.d\t$r0,%0,0", xops);
+  else
+    output_asm_insn ("st.w\t$r0,%0,0", xops);
+  output_asm_insn (strcat (tmp, &loop_lab[1]), xops);
+
+  return "";
+}
+
+/* Expand the "prologue" pattern.  */
+
+void
+loongarch_expand_prologue (void)
+{
+  struct loongarch_frame_info *frame = &cfun->machine->frame;
+  HOST_WIDE_INT size = frame->total_size;
+  HOST_WIDE_INT tmp;
+  unsigned mask = frame->mask;
+  rtx insn;
+
+  if (flag_stack_usage_info)
+    current_function_static_stack_size = size;
+
+  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+      || flag_stack_clash_protection)
+    {
+      if (crtl->is_leaf && !cfun->calls_alloca)
+	{
+	  if (size > PROBE_INTERVAL && size > get_stack_check_protect ())
+	    {
+	      tmp = size - get_stack_check_protect ();
+	      loongarch_emit_probe_stack_range (get_stack_check_protect (),
+						tmp);
+	    }
+	}
+      else if (size > 0)
+	loongarch_emit_probe_stack_range (get_stack_check_protect (), size);
+    }
+
+  /* Save the registers.  */
+  if ((frame->mask | frame->fmask) != 0)
+    {
+      HOST_WIDE_INT step1 = MIN (size, loongarch_first_stack_step (frame));
+
+      insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+			    GEN_INT (-step1));
+      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+      size -= step1;
+      loongarch_for_each_saved_reg (size, loongarch_save_reg);
+    }
+
+  frame->mask = mask; /* Undo the above fib.  */
+
+  /* Set up the frame pointer, if we're using one.  */
+  if (frame_pointer_needed)
+    {
+      insn = gen_add3_insn (hard_frame_pointer_rtx, stack_pointer_rtx,
+			    GEN_INT (frame->hard_frame_pointer_offset - size));
+      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+
+      loongarch_emit_stack_tie ();
+    }
+
+  /* Allocate the rest of the frame.  */
+  if (size > 0)
+    {
+      if (IMM12_OPERAND (-size))
+	{
+	  insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+				GEN_INT (-size));
+	  RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
+	}
+      else
+	{
+	  loongarch_emit_move (LARCH_PROLOGUE_TEMP (Pmode), GEN_INT (-size));
+	  emit_insn (gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+				    LARCH_PROLOGUE_TEMP (Pmode)));
+
+	  /* Describe the effect of the previous instructions.  */
+	  insn = plus_constant (Pmode, stack_pointer_rtx, -size);
+	  insn = gen_rtx_SET (stack_pointer_rtx, insn);
+	  loongarch_set_frame_expr (insn);
+	}
+    }
+}
+
+/* Return nonzero if this function is known to have a null epilogue.
+   This allows the optimizer to omit jumps to jumps if no stack
+   was created.  */
+
+bool
+loongarch_can_use_return_insn (void)
+{
+  return reload_completed && cfun->machine->frame.total_size == 0;
+}
+
+/* Expand an "epilogue" or "sibcall_epilogue" pattern; SIBCALL_P
+   says which.  */
+
+void
+loongarch_expand_epilogue (bool sibcall_p)
+{
+  /* Split the frame into two.  STEP1 is the amount of stack we should
+     deallocate before restoring the registers.  STEP2 is the amount we
+     should deallocate afterwards.
+
+     Start off by assuming that no registers need to be restored.  */
+  struct loongarch_frame_info *frame = &cfun->machine->frame;
+  HOST_WIDE_INT step1 = frame->total_size;
+  HOST_WIDE_INT step2 = 0;
+  rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
+  rtx insn;
+
+  /* We need a memory barrier to prevent reads from the deallocated
+     stack.  */
+  bool need_barrier_p
+    = (get_frame_size () + cfun->machine->frame.arg_pointer_offset) != 0;
+
+  if (!sibcall_p && loongarch_can_use_return_insn ())
+    {
+      emit_jump_insn (gen_return ());
+      return;
+    }
+
+  /* Move past any dynamic stack allocations.  */
+  if (cfun->calls_alloca)
+    {
+      /* Emit a barrier to prevent loads from a deallocated stack.  */
+      loongarch_emit_stack_tie ();
+      need_barrier_p = false;
+
+      rtx adjust = GEN_INT (-frame->hard_frame_pointer_offset);
+      if (!IMM12_OPERAND (INTVAL (adjust)))
+	{
+	  loongarch_emit_move (LARCH_PROLOGUE_TEMP (Pmode), adjust);
+	  adjust = LARCH_PROLOGUE_TEMP (Pmode);
+	}
+
+      insn = emit_insn (gen_add3_insn (stack_pointer_rtx,
+				       hard_frame_pointer_rtx,
+				       adjust));
+
+      rtx dwarf = NULL_RTX;
+      rtx minus_offset = NULL_RTX;
+      minus_offset = GEN_INT (-frame->hard_frame_pointer_offset);
+      rtx cfa_adjust_value = gen_rtx_PLUS (Pmode,
+					   hard_frame_pointer_rtx,
+					   minus_offset);
+
+      rtx cfa_adjust_rtx = gen_rtx_SET (stack_pointer_rtx, cfa_adjust_value);
+      dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, cfa_adjust_rtx, dwarf);
+      RTX_FRAME_RELATED_P (insn) = 1;
+
+      REG_NOTES (insn) = dwarf;
+    }
+
+  /* If we need to restore registers, deallocate as much stack as
+     possible in the second step without going out of range.  */
+  if ((frame->mask | frame->fmask) != 0)
+    {
+      step2 = loongarch_first_stack_step (frame);
+      step1 -= step2;
+    }
+
+  /* Deallocate the first STEP1 bytes: set SP to SP + STEP1.  */
+  if (step1 > 0)
+    {
+      /* Emit a barrier to prevent loads from a deallocated stack.  */
+      loongarch_emit_stack_tie ();
+      need_barrier_p = false;
+
+      /* Get an rtx for STEP1 that we can add to BASE.  */
+      rtx adjust = GEN_INT (step1);
+      if (!IMM12_OPERAND (step1))
+	{
+	  loongarch_emit_move (LARCH_PROLOGUE_TEMP (Pmode), adjust);
+	  adjust = LARCH_PROLOGUE_TEMP (Pmode);
+	}
+
+      insn = emit_insn (gen_add3_insn (stack_pointer_rtx,
+				       stack_pointer_rtx,
+				       adjust));
+
+      rtx dwarf = NULL_RTX;
+      rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+					 GEN_INT (step2));
+
+      dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
+      RTX_FRAME_RELATED_P (insn) = 1;
+
+      REG_NOTES (insn) = dwarf;
+    }
+
+  /* Restore the registers.  */
+  loongarch_for_each_saved_reg (frame->total_size - step2,
+				loongarch_restore_reg);
+
+  if (need_barrier_p)
+    loongarch_emit_stack_tie ();
+
+  /* Deallocate the final bit of the frame.  */
+  if (step2 > 0)
+    {
+      insn = emit_insn (gen_add3_insn (stack_pointer_rtx,
+				       stack_pointer_rtx,
+				       GEN_INT (step2)));
+
+      rtx dwarf = NULL_RTX;
+      rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx, const0_rtx);
+      dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf);
+      RTX_FRAME_RELATED_P (insn) = 1;
+
+      REG_NOTES (insn) = dwarf;
+    }
+
+  /* Add in the __builtin_eh_return stack adjustment.  */
+  if (crtl->calls_eh_return)
+    emit_insn (gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
+			      EH_RETURN_STACKADJ_RTX));
+
+  if (!sibcall_p)
+    emit_jump_insn (gen_simple_return_internal (ra));
+}
+
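+/* Masks for the fields loaded by lu32i.d (bits 32-51) and lu52i.d
+   (bits 52-63) respectively.  */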
+#define LU32I_B (0xfffffULL << 32)
+#define LU52I_B (0xfffULL << 52)
+
+/* Fill CODES with a sequence of rtl operations to load VALUE.
+   Return the number of operations needed.  */
+
+static unsigned int
+loongarch_build_integer (struct loongarch_integer_op *codes,
+			 HOST_WIDE_INT value)
+
+{
+  unsigned int cost = 0;
+
+  /* Get the lower 32 bits of the value.  */
+  HOST_WIDE_INT low_part = TARGET_64BIT ? value << 32 >> 32 : value;
+
+  if (IMM12_OPERAND (low_part) || IMM12_OPERAND_UNSIGNED (low_part))
+    {
+      /* The lower 32 bits can be loaded with a single instruction
+	 (addi.w or ori).  */
+      codes[0].code = UNKNOWN;
+      codes[0].method = METHOD_NORMAL;
+      codes[0].value = low_part;
+      cost++;
+    }
+  else
+    {
+      /* lu12i.w + ior.  */
+      codes[0].code = UNKNOWN;
+      codes[0].method = METHOD_NORMAL;
+      codes[0].value = low_part & ~(IMM_REACH - 1);
+      cost++;
+      HOST_WIDE_INT iorv = low_part & (IMM_REACH - 1);
+      if (iorv != 0)
+	{
+	  codes[1].code = IOR;
+	  codes[1].method = METHOD_NORMAL;
+	  codes[1].value = iorv;
+	  cost++;
+	}
+    }
+
+  if (TARGET_64BIT)
+    {
+      bool lu32i[2] = {(value & LU32I_B) == 0, (value & LU32I_B) == LU32I_B};
+      bool lu52i[2] = {(value & LU52I_B) == 0, (value & LU52I_B) == LU52I_B};
+
+      int sign31 = (value & (1UL << 31)) >> 31;
+      /* Determine whether the upper 32 bits are sign-extended from the
+	 lower 32 bits.  If so, the instructions that load the high part
+	 can be omitted.  */
+      if (lu32i[sign31] && lu52i[sign31])
+	return cost;
+      /* Determine whether bits 32-51 are sign-extended from the lower 32
+	 bits. If so, directly load 52-63 bits.  */
+      else if (lu32i[sign31])
+	{
+	  codes[cost].method = METHOD_LU52I;
+	  codes[cost].value = (value >> 52) << 52;
+	  return cost + 1;
+	}
+
+      codes[cost].method = METHOD_LU32I;
+      codes[cost].value = ((value << 12) >> 44) << 32;
+      cost++;
+
+      /* Determine whether bits 52-63 are sign-extended from the lower
+	 bits; if not, load bits 52-63.  */
+      if (!lu52i[(value & (1ULL << 51)) >> 51])
+	{
+	  codes[cost].method = METHOD_LU52I;
+	  codes[cost].value = (value >> 52) << 52;
+	  cost++;
+	}
+    }
+
+  gcc_assert (cost <= LARCH_MAX_INTEGER_OPS);
+
+  return cost;
+}
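+
+/* A worked example (illustrative): for VALUE == 0x1234567812345678 the
+   code above produces lu12i.w (bits 12-31 of the low word), ori
+   (bits 0-11), lu32i.d (bits 32-51) and lu52i.d (bits 52-63), four
+   operations in total, which is the algorithm's maximum.  */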
+
+/* Return the number of operations needed to load VALUE into a register.
+   The actual split is performed in loongarch_output_move.  */
+
+static unsigned int
+loongarch_integer_cost (HOST_WIDE_INT value)
+{
+  struct loongarch_integer_op codes[LARCH_MAX_INTEGER_OPS];
+  return loongarch_build_integer (codes, value);
+}
+
+/* Implement TARGET_LEGITIMATE_CONSTANT_P.  */
+
+static bool
+loongarch_legitimate_constant_p (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
+{
+  return loongarch_const_insns (x) > 0;
+}
+
+/* Return true if X is a thread-local symbol.  */
+
+static bool
+loongarch_tls_symbol_p (rtx x)
+{
+  return GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x) != 0;
+}
+
+/* Return true if SYMBOL_REF X is associated with a global symbol
+   (in the STB_GLOBAL sense).  */
+
+bool
+loongarch_global_symbol_p (const_rtx x)
+{
+  if (GET_CODE (x) == LABEL_REF)
+    return false;
+
+  const_tree decl = SYMBOL_REF_DECL (x);
+
+  if (!decl)
+    return !SYMBOL_REF_LOCAL_P (x) || SYMBOL_REF_EXTERNAL_P (x);
+
+  /* Weakref symbols are not TREE_PUBLIC, but their targets are global
+     or weak symbols.  Relocations in the object file will be against
+     the target symbol, so it's that symbol's binding that matters here.  */
+  return DECL_P (decl) && (TREE_PUBLIC (decl) || DECL_WEAK (decl));
+}
+
+bool
+loongarch_global_symbol_noweak_p (const_rtx x)
+{
+  if (GET_CODE (x) == LABEL_REF)
+    return false;
+
+  const_tree decl = SYMBOL_REF_DECL (x);
+
+  if (!decl)
+    return !SYMBOL_REF_LOCAL_P (x) || SYMBOL_REF_EXTERNAL_P (x);
+
+  /* Weakref symbols are not TREE_PUBLIC, but their targets are global
+     or weak symbols.  Relocations in the object file will be against
+     the target symbol, so it's that symbol's binding that matters here.  */
+  return DECL_P (decl) && TREE_PUBLIC (decl);
+}
+
+bool
+loongarch_weak_symbol_p (const_rtx x)
+{
+  const_tree decl;
+  if (GET_CODE (x) == LABEL_REF || !(decl = SYMBOL_REF_DECL (x)))
+    return false;
+  return DECL_P (decl) && DECL_WEAK (decl);
+}
+
+/* Return true if SYMBOL_REF X binds locally.  */
+
+bool
+loongarch_symbol_binds_local_p (const_rtx x)
+{
+  if (GET_CODE (x) == LABEL_REF)
+    return false;
+
+  return (SYMBOL_REF_DECL (x) ? targetm.binds_local_p (SYMBOL_REF_DECL (x))
+			      : SYMBOL_REF_LOCAL_P (x));
+}
+
+/* Return true if rtx constants of mode MODE should be put into a small
+   data section.  */
+
+static bool
+loongarch_rtx_constant_in_small_data_p (machine_mode mode)
+{
+  return (GET_MODE_SIZE (mode) <= g_switch_value);
+}
+
+/* Return the method that should be used to access SYMBOL_REF or
+   LABEL_REF X in context CONTEXT.  */
+
+static enum loongarch_symbol_type
+loongarch_classify_symbol (const_rtx x,
+			   enum loongarch_symbol_context context
+			   ATTRIBUTE_UNUSED)
+{
+  if (GET_CODE (x) == LABEL_REF)
+    return SYMBOL_GOT_DISP;
+
+  gcc_assert (GET_CODE (x) == SYMBOL_REF);
+
+  if (SYMBOL_REF_TLS_MODEL (x))
+    return SYMBOL_TLS;
+
+  /* The assertion above guarantees a SYMBOL_REF here.  */
+  return SYMBOL_GOT_DISP;
+}
+
+/* Return true if X is a symbolic constant that can be used in context
+   CONTEXT.  If it is, store the type of the symbol in *SYMBOL_TYPE.  */
+
+bool
+loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_context context,
+			       enum loongarch_symbol_type *symbol_type)
+{
+  rtx offset;
+
+  split_const (x, &x, &offset);
+  if (UNSPEC_ADDRESS_P (x))
+    {
+      *symbol_type = UNSPEC_ADDRESS_TYPE (x);
+      x = UNSPEC_ADDRESS (x);
+    }
+  else if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)
+    {
+      *symbol_type = loongarch_classify_symbol (x, context);
+      if (*symbol_type == SYMBOL_TLS)
+	return true;
+    }
+  else
+    return false;
+
+  if (offset == const0_rtx)
+    return true;
+
+  /* Check whether a nonzero offset is valid for the underlying
+     relocations.  */
+  switch (*symbol_type)
+    {
+    case SYMBOL_GOT_DISP:
+    case SYMBOL_TLSGD:
+    case SYMBOL_TLSLDM:
+    case SYMBOL_TLS:
+      return false;
+    }
+  gcc_unreachable ();
+}
+
+/* Helper for loongarch_symbol_insns below, which documents the
+   interpretation of TYPE and MODE.  */
+
+static int
+loongarch_symbol_insns_1 (enum loongarch_symbol_type type, machine_mode mode)
+{
+  switch (type)
+    {
+    case SYMBOL_GOT_DISP:
+      /* The constant will have to be loaded from the GOT before it
+	 is used in an address.  */
+      if (mode != MAX_MACHINE_MODE)
+	return 0;
+
+      return 3;
+
+    case SYMBOL_TLSGD:
+    case SYMBOL_TLSLDM:
+      return 1;
+
+    case SYMBOL_TLS:
+      /* We don't treat a bare TLS symbol as a constant.  */
+      return 0;
+    }
+  gcc_unreachable ();
+}
+
+/* If MODE is MAX_MACHINE_MODE, return the number of instructions needed
+   to load symbols of type TYPE into a register.  Return 0 if the given
+   type of symbol cannot be used as an immediate operand.
+
+   Otherwise, return the number of instructions needed to load or store
+   values of mode MODE to or from addresses of type TYPE.  Return 0 if
+   the given type of symbol is not valid in addresses.
+
+   In both cases, instruction counts are based off BASE_INSN_LENGTH.  */
+
+static int
+loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode)
+{
+  return loongarch_symbol_insns_1 (type, mode);
+}
+
+/* Implement TARGET_CANNOT_FORCE_CONST_MEM.  */
+
+static bool
+loongarch_cannot_force_const_mem (machine_mode mode, rtx x)
+{
+  enum loongarch_symbol_type type;
+  rtx base, offset;
+
+  /* As an optimization, reject constants that loongarch_legitimize_move
+     can expand inline.
+
+     Suppose we have a multi-instruction sequence that loads constant C
+     into register R.  If R does not get allocated a hard register, and
+     R is used in an operand that allows both registers and memory
+     references, reload will consider forcing C into memory and using
+     one of the instruction's memory alternatives.  Returning false
+     here will force it to use an input reload instead.  */
+  if (CONST_INT_P (x) && loongarch_legitimate_constant_p (mode, x))
+    return true;
+
+  split_const (x, &base, &offset);
+  if (loongarch_symbolic_constant_p (base, SYMBOL_CONTEXT_LEA, &type))
+    {
+      /* The same optimization as for CONST_INT.  */
+      if (IMM12_INT (offset)
+	  && loongarch_symbol_insns (type, MAX_MACHINE_MODE) > 0)
+	return true;
+    }
+
+  /* TLS symbols must be computed by loongarch_legitimize_move.  */
+  if (tls_referenced_p (x))
+    return true;
+
+  return false;
+}
+
+/* Return true if register REGNO is a valid base register for mode MODE.
+   STRICT_P is true if REG_OK_STRICT is in effect.  */
+
+int
+loongarch_regno_mode_ok_for_base_p (int regno,
+				    machine_mode mode ATTRIBUTE_UNUSED,
+				    bool strict_p)
+{
+  if (!HARD_REGISTER_NUM_P (regno))
+    {
+      if (!strict_p)
+	return true;
+      regno = reg_renumber[regno];
+    }
+
+  /* These fake registers will be eliminated to either the stack or
+     hard frame pointer, both of which are usually valid base registers.
+     Reload deals with the cases where the eliminated form isn't valid.  */
+  if (regno == ARG_POINTER_REGNUM || regno == FRAME_POINTER_REGNUM)
+    return true;
+
+  return GP_REG_P (regno);
+}
+
+/* Return true if X is a valid base register for mode MODE.
+   STRICT_P is true if REG_OK_STRICT is in effect.  */
+
+static bool
+loongarch_valid_base_register_p (rtx x, machine_mode mode, bool strict_p)
+{
+  if (!strict_p && GET_CODE (x) == SUBREG)
+    x = SUBREG_REG (x);
+
+  return (REG_P (x)
+	  && loongarch_regno_mode_ok_for_base_p (REGNO (x), mode, strict_p));
+}
+
+/* Return true if, for every base register BASE_REG, (plus BASE_REG X)
+   can address a value of mode MODE.  */
+
+static bool
+loongarch_valid_offset_p (rtx x, machine_mode mode)
+{
+  /* Check that X is a signed 12-bit number, or a signed 16-bit number
+     that is also a multiple of 4 (a 14-bit field shifted left by 2).  */
+  if (!(const_arith_operand (x, Pmode)
+	|| ((mode == E_SImode || mode == E_DImode)
+	    && const_imm16_operand (x, Pmode)
+	    && (loongarch_signed_immediate_p (INTVAL (x), 14, 2)))))
+    return false;
+
+  /* We may need to split multiword moves, so make sure that every word
+     is accessible.  */
+  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD
+      && !IMM12_OPERAND (INTVAL (x) + GET_MODE_SIZE (mode) - UNITS_PER_WORD))
+    return false;
+
+  return true;
+}
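+
+/* As an illustration, for an SImode or DImode access the check above
+   accepts a 0x7ffc offset (a multiple of 4 within the signed 16-bit
+   range, matching the ldptr/stptr encodings) but rejects 0x7ffe, which
+   fails the alignment test.  */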
+
+static bool
+loongarch_valid_index_p (struct loongarch_address_info *info, rtx x,
+			  machine_mode mode, bool strict_p)
+{
+  rtx index;
+
+  if ((REG_P (x) || SUBREG_P (x))
+      && GET_MODE (x) == Pmode)
+    {
+      index = x;
+    }
+  else
+    return false;
+
+  if (!strict_p
+      && GET_CODE (index) == SUBREG
+      && contains_reg_of_mode[GENERAL_REGS][GET_MODE (SUBREG_REG (index))])
+    index = SUBREG_REG (index);
+
+  if (loongarch_valid_base_register_p (index, mode, strict_p))
+    {
+      info->type = ADDRESS_REG_REG;
+      info->offset = index;
+      return true;
+    }
+
+  return false;
+}
+
+/* Return true if X is a valid address for machine mode MODE.  If it is,
+   fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
+   effect.  */
+
+static bool
+loongarch_classify_address (struct loongarch_address_info *info, rtx x,
+			    machine_mode mode, bool strict_p)
+{
+  switch (GET_CODE (x))
+    {
+    case REG:
+    case SUBREG:
+      info->type = ADDRESS_REG;
+      info->reg = x;
+      info->offset = const0_rtx;
+      return loongarch_valid_base_register_p (info->reg, mode, strict_p);
+
+    case PLUS:
+      if (loongarch_valid_base_register_p (XEXP (x, 0), mode, strict_p)
+	  && loongarch_valid_index_p (info, XEXP (x, 1), mode, strict_p))
+	{
+	  info->reg = XEXP (x, 0);
+	  return true;
+	}
+
+      if (loongarch_valid_base_register_p (XEXP (x, 1), mode, strict_p)
+	 && loongarch_valid_index_p (info, XEXP (x, 0), mode, strict_p))
+	{
+	  info->reg = XEXP (x, 1);
+	  return true;
+	}
+
+      info->type = ADDRESS_REG;
+      info->reg = XEXP (x, 0);
+      info->offset = XEXP (x, 1);
+      return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
+	      && loongarch_valid_offset_p (info->offset, mode));
+    default:
+      return false;
+    }
+}
+
+/* Implement TARGET_LEGITIMATE_ADDRESS_P.  */
+
+static bool
+loongarch_legitimate_address_p (machine_mode mode, rtx x, bool strict_p)
+{
+  struct loongarch_address_info addr;
+
+  return loongarch_classify_address (&addr, x, mode, strict_p);
+}
+
+/* Return true if ADDR matches the pattern for the indexed address
+   instruction.  */
+
+static bool
+loongarch_index_address_p (rtx addr, machine_mode mode ATTRIBUTE_UNUSED)
+{
+  if (GET_CODE (addr) != PLUS
+      || !REG_P (XEXP (addr, 0))
+      || !REG_P (XEXP (addr, 1)))
+    return false;
+  return true;
+}
+
+/* Return the number of instructions needed to load or store a value
+   of mode MODE at address X, assuming that BASE_INSN_LENGTH is the
+   length of one instruction.  Return 0 if X isn't valid for MODE.
+   Assume that multiword moves may need to be split into word moves
+   if MIGHT_SPLIT_P, otherwise assume that a single load or store is
+   enough.  */
+
+int
+loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
+{
+  struct loongarch_address_info addr;
+  int factor;
+
+  if (!loongarch_classify_address (&addr, x, mode, false))
+    return 0;
+
+  /* BLKmode is used for single unaligned loads and stores and should
+     not count as a multiword mode.  (GET_MODE_SIZE (BLKmode) is pretty
+     meaningless, so we have to single it out as a special case one way
+     or the other.)  */
+  if (mode != BLKmode && might_split_p)
+    factor = (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+  else
+    factor = 1;
+
+  /* ADDR was filled in by the successful classification above.  */
+  switch (addr.type)
+    {
+    case ADDRESS_REG:
+    case ADDRESS_REG_REG:
+    case ADDRESS_CONST_INT:
+      return factor;
+
+    case ADDRESS_SYMBOLIC:
+      return factor * loongarch_symbol_insns (addr.symbol_type, mode);
+    }
+  return 0;
+}
+
+/* Return true if X fits within an unsigned field of BITS bits that is
+   shifted left SHIFT bits before being used.  */
+
+bool
+loongarch_unsigned_immediate_p (unsigned HOST_WIDE_INT x, int bits,
+				int shift = 0)
+{
+  return (x & ((1 << shift) - 1)) == 0 && x < ((unsigned) 1 << (shift + bits));
+}
+
+/* Return true if X fits within a signed field of BITS bits that is
+   shifted left SHIFT bits before being used.  */
+
+bool
+loongarch_signed_immediate_p (unsigned HOST_WIDE_INT x, int bits,
+			      int shift = 0)
+{
+  x += 1 << (bits + shift - 1);
+  return loongarch_unsigned_immediate_p (x, bits, shift);
+}
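+
+/* For example, loongarch_signed_immediate_p (x, 14, 2) accepts exactly
+   the multiples of 4 in [-32768, 32764]: adding 1 << 15 biases the
+   signed range into [0, 65532], which the unsigned check above then
+   validates.  */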
+
+/* Return true if X is a legitimate address with a 12-bit offset.
+   MODE is the mode of the value being accessed.  */
+
+bool
+loongarch_12bit_offset_address_p (rtx x, machine_mode mode)
+{
+  struct loongarch_address_info addr;
+
+  return (loongarch_classify_address (&addr, x, mode, false)
+	  && addr.type == ADDRESS_REG
+	  && CONST_INT_P (addr.offset)
+	  && LARCH_U12BIT_OFFSET_P (INTVAL (addr.offset)));
+}
+
+/* Return true if X is a legitimate address whose offset is a 14-bit
+   value shifted left by 2 (a 16-bit offset that is a multiple of 4).
+   MODE is the mode of the value being accessed.  */
+
+bool
+loongarch_14bit_shifted_offset_address_p (rtx x, machine_mode mode)
+{
+  struct loongarch_address_info addr;
+
+  return (loongarch_classify_address (&addr, x, mode, false)
+	  && addr.type == ADDRESS_REG
+	  && CONST_INT_P (addr.offset)
+	  && LARCH_16BIT_OFFSET_P (INTVAL (addr.offset))
+	  && LARCH_SHIFT_2_OFFSET_P (INTVAL (addr.offset)));
+}
+
+/* Return the number of instructions needed to load constant X,
+   assuming that BASE_INSN_LENGTH is the length of one instruction.
+   Return 0 if X isn't a valid constant.  */
+
+int
+loongarch_const_insns (rtx x)
+{
+  enum loongarch_symbol_type symbol_type;
+  rtx offset;
+
+  switch (GET_CODE (x))
+    {
+    case CONST_INT:
+      return loongarch_integer_cost (INTVAL (x));
+
+    case CONST_VECTOR:
+      /* Fall through.  */
+    case CONST_DOUBLE:
+      /* Allow zeros for normal mode, where we can use $0.  */
+      return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
+
+    case CONST:
+      /* See if we can refer to X directly.  */
+      if (loongarch_symbolic_constant_p (x, SYMBOL_CONTEXT_LEA, &symbol_type))
+	return loongarch_symbol_insns (symbol_type, MAX_MACHINE_MODE);
+
+      /* Otherwise try splitting the constant into a base and offset.
+	 If the offset is a 12-bit value, we can load the base address
+	 into a register and then use ADDI.{W/D} to add in the offset.
+	 If the offset is larger, we can load the base and offset
+	 into separate registers and add them together with ADD.{W/D}.
+	 However, the latter is only possible before reload; during
+	 and after reload, we must have the option of forcing the
+	 constant into the pool instead.  */
+      split_const (x, &x, &offset);
+      if (offset != 0)
+	{
+	  int n = loongarch_const_insns (x);
+	  if (n != 0)
+	    {
+	      if (IMM12_INT (offset))
+		return n + 1;
+	      else if (!targetm.cannot_force_const_mem (GET_MODE (x), x))
+		return n + 1 + loongarch_integer_cost (INTVAL (offset));
+	    }
+	}
+      return 0;
+
+    case SYMBOL_REF:
+    case LABEL_REF:
+      return loongarch_symbol_insns (
+	loongarch_classify_symbol (x, SYMBOL_CONTEXT_LEA), MAX_MACHINE_MODE);
+
+    default:
+      return 0;
+    }
+}
+
+/* X is a doubleword constant that can be handled by splitting it into
+   two words and loading each word separately.  Return the number of
+   instructions required to do this, assuming that BASE_INSN_LENGTH
+   is the length of one instruction.  */
+
+int
+loongarch_split_const_insns (rtx x)
+{
+  unsigned int low, high;
+
+  low = loongarch_const_insns (loongarch_subword (x, false));
+  high = loongarch_const_insns (loongarch_subword (x, true));
+  gcc_assert (low > 0 && high > 0);
+  return low + high;
+}
+
+/* Return the number of instructions needed to implement INSN,
+   given that it loads from or stores to MEM.  Assume that
+   BASE_INSN_LENGTH is the length of one instruction.  */
+
+int
+loongarch_load_store_insns (rtx mem, rtx_insn *insn)
+{
+  machine_mode mode;
+  bool might_split_p;
+  rtx set;
+
+  gcc_assert (MEM_P (mem));
+  mode = GET_MODE (mem);
+
+  /* Try to prove that INSN does not need to be split.  */
+  might_split_p = GET_MODE_SIZE (mode) > UNITS_PER_WORD;
+  if (might_split_p)
+    {
+      set = single_set (insn);
+      if (set
+	  && !loongarch_split_move_insn_p (SET_DEST (set), SET_SRC (set),
+					   insn))
+	might_split_p = false;
+    }
+
+  return loongarch_address_insns (XEXP (mem, 0), mode, might_split_p);
+}
+
+/* Return the number of instructions needed for an integer division,
+   assuming that BASE_INSN_LENGTH is the length of one instruction.  */
+
+int
+loongarch_idiv_insns (machine_mode mode ATTRIBUTE_UNUSED)
+{
+  int count;
+
+  count = 1;
+  if (TARGET_CHECK_ZERO_DIV)
+    count += 2;
+
+  return count;
+}
+
+/* Emit an instruction of the form (set TARGET (CODE OP0 OP1)).  */
+
+void
+loongarch_emit_binary (enum rtx_code code, rtx target, rtx op0, rtx op1)
+{
+  emit_insn (gen_rtx_SET (target, gen_rtx_fmt_ee (code, GET_MODE (target),
+						  op0, op1)));
+}
+
+/* Compute (CODE OP0 OP1) and store the result in a new register
+   of mode MODE.  Return that new register.  */
+
+static rtx
+loongarch_force_binary (machine_mode mode, enum rtx_code code, rtx op0,
+			rtx op1)
+{
+  rtx reg;
+
+  reg = gen_reg_rtx (mode);
+  loongarch_emit_binary (code, reg, op0, op1);
+  return reg;
+}
+
+/* Copy VALUE to a register and return that register.  If new pseudos
+   are allowed, copy it into a new register, otherwise use DEST.  */
+
+static rtx
+loongarch_force_temporary (rtx dest, rtx value)
+{
+  if (can_create_pseudo_p ())
+    return force_reg (Pmode, value);
+  else
+    {
+      loongarch_emit_move (dest, value);
+      return dest;
+    }
+}
+
+/* Wrap symbol or label BASE in an UNSPEC address of type SYMBOL_TYPE,
+   then add CONST_INT OFFSET to the result.  */
+
+static rtx
+loongarch_unspec_address_offset (rtx base, rtx offset,
+				 enum loongarch_symbol_type symbol_type)
+{
+  base = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, base),
+			 UNSPEC_ADDRESS_FIRST + symbol_type);
+  if (offset != const0_rtx)
+    base = gen_rtx_PLUS (Pmode, base, offset);
+  return gen_rtx_CONST (Pmode, base);
+}
+
+/* Return an UNSPEC address with underlying address ADDRESS and symbol
+   type SYMBOL_TYPE.  */
+
+rtx
+loongarch_unspec_address (rtx address, enum loongarch_symbol_type symbol_type)
+{
+  rtx base, offset;
+
+  split_const (address, &base, &offset);
+  return loongarch_unspec_address_offset (base, offset, symbol_type);
+}
+
+/* If OP is an UNSPEC address, return the address to which it refers,
+   otherwise return OP itself.  */
+
+rtx
+loongarch_strip_unspec_address (rtx op)
+{
+  rtx base, offset;
+
+  split_const (op, &base, &offset);
+  if (UNSPEC_ADDRESS_P (base))
+    op = plus_constant (Pmode, UNSPEC_ADDRESS (base), INTVAL (offset));
+  return op;
+}
+
+/* Return a legitimate address for REG + OFFSET.  TEMP is as for
+   loongarch_force_temporary; it is only needed when OFFSET is not an
+   IMM12_OPERAND.  */
+
+static rtx
+loongarch_add_offset (rtx temp, rtx reg, HOST_WIDE_INT offset)
+{
+  if (!IMM12_OPERAND (offset))
+    {
+      rtx high;
+
+      /* Leave OFFSET as a 12-bit offset and put the excess in HIGH.
+	 The addition inside the macro CONST_HIGH_PART may cause an
+	 overflow, so we need to force a sign-extension check.  */
+      high = gen_int_mode (CONST_HIGH_PART (offset), Pmode);
+      offset = CONST_LOW_PART (offset);
+      high = loongarch_force_temporary (temp, high);
+      reg = loongarch_force_temporary (temp, gen_rtx_PLUS (Pmode, high, reg));
+    }
+  return plus_constant (Pmode, reg, offset);
+}
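+
+/* For instance, assuming CONST_LOW_PART sign-extends the low 12 bits,
+   OFFSET == 0x12800 splits into HIGH == 0x13000 plus a residual offset
+   of -0x800, since 0x800 sign-extends to -2048 and the carry is folded
+   into HIGH.  */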
+
+/* The __tls_get_addr symbol.  */
+static GTY (()) rtx loongarch_tls_symbol;
+
+/* Load an entry from the GOT for a TLS GD access.  */
+
+static rtx
+loongarch_got_load_tls_gd (rtx dest, rtx sym)
+{
+  if (Pmode == DImode)
+    return gen_got_load_tls_gddi (dest, sym);
+  else
+    return gen_got_load_tls_gdsi (dest, sym);
+}
+
+/* Load an entry from the GOT for a TLS LD access.  */
+
+static rtx
+loongarch_got_load_tls_ld (rtx dest, rtx sym)
+{
+  if (Pmode == DImode)
+    return gen_got_load_tls_lddi (dest, sym);
+  else
+    return gen_got_load_tls_ldsi (dest, sym);
+}
+
+/* Load an entry from the GOT for a TLS IE access.  */
+
+static rtx
+loongarch_got_load_tls_ie (rtx dest, rtx sym)
+{
+  if (Pmode == DImode)
+    return gen_got_load_tls_iedi (dest, sym);
+  else
+    return gen_got_load_tls_iesi (dest, sym);
+}
+
+/* Add in the thread pointer for a TLS LE access.  */
+
+static rtx
+loongarch_got_load_tls_le (rtx dest, rtx sym)
+{
+  if (Pmode == DImode)
+    return gen_got_load_tls_ledi (dest, sym);
+  else
+    return gen_got_load_tls_lesi (dest, sym);
+}
+
+/* Return an instruction sequence that calls __tls_get_addr.  SYM is
+   the TLS symbol we are referencing and TYPE is the symbol type to use
+   (either global dynamic or local dynamic).  V0 is an RTX for the
+   return value location.  */
+
+static rtx_insn *
+loongarch_call_tls_get_addr (rtx sym, enum loongarch_symbol_type type, rtx v0)
+{
+  rtx loc, a0;
+  rtx_insn *insn;
+
+  a0 = gen_rtx_REG (Pmode, GP_ARG_FIRST);
+
+  if (!loongarch_tls_symbol)
+    loongarch_tls_symbol = init_one_libfunc ("__tls_get_addr");
+
+  loc = loongarch_unspec_address (sym, type);
+
+  start_sequence ();
+
+  if (type == SYMBOL_TLSLDM)
+    emit_insn (loongarch_got_load_tls_ld (a0, loc));
+  else if (type == SYMBOL_TLSGD)
+    emit_insn (loongarch_got_load_tls_gd (a0, loc));
+  else
+    gcc_unreachable ();
+
+  insn = emit_call_insn (gen_call_value_internal (v0, loongarch_tls_symbol,
+						  const0_rtx));
+  RTL_CONST_CALL_P (insn) = 1;
+  use_reg (&CALL_INSN_FUNCTION_USAGE (insn), a0);
+  insn = get_insns ();
+
+  end_sequence ();
+
+  return insn;
+}
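+
+/* Schematically, the sequence built above is:
+
+     la.tls.gd  $a0, sym        # or la.tls.ld for local-dynamic
+     bl         __tls_get_addr
+
+   with the result left in the register passed as V0.  */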
+
+/* Generate the code to access LOC, a thread-local SYMBOL_REF, and return
+   its address.  The return value will be both a valid address and a valid
+   SET_SRC (either a REG or a LO_SUM).  */
+
+static rtx
+loongarch_legitimize_tls_address (rtx loc)
+{
+  rtx dest, tp, tmp;
+  enum tls_model model = SYMBOL_REF_TLS_MODEL (loc);
+  rtx_insn *insn;
+
+  switch (model)
+    {
+    case TLS_MODEL_LOCAL_DYNAMIC:
+      tmp = gen_rtx_REG (Pmode, GP_RETURN);
+      dest = gen_reg_rtx (Pmode);
+      insn = loongarch_call_tls_get_addr (loc, SYMBOL_TLSLDM, tmp);
+      emit_libcall_block (insn, dest, tmp, loc);
+      break;
+
+    case TLS_MODEL_GLOBAL_DYNAMIC:
+      tmp = gen_rtx_REG (Pmode, GP_RETURN);
+      dest = gen_reg_rtx (Pmode);
+      insn = loongarch_call_tls_get_addr (loc, SYMBOL_TLSGD, tmp);
+      emit_libcall_block (insn, dest, tmp, loc);
+      break;
+
+    case TLS_MODEL_INITIAL_EXEC:
+      /* la.tls.ie; tp-relative add  */
+      tp = gen_rtx_REG (Pmode, THREAD_POINTER_REGNUM);
+      tmp = gen_reg_rtx (Pmode);
+      emit_insn (loongarch_got_load_tls_ie (tmp, loc));
+      dest = gen_reg_rtx (Pmode);
+      emit_insn (gen_add3_insn (dest, tmp, tp));
+      break;
+
+    case TLS_MODEL_LOCAL_EXEC:
+      /* la.tls.le; tp-relative add  */
+      tp = gen_rtx_REG (Pmode, THREAD_POINTER_REGNUM);
+      tmp = gen_reg_rtx (Pmode);
+      emit_insn (loongarch_got_load_tls_le (tmp, loc));
+      dest = gen_reg_rtx (Pmode);
+      emit_insn (gen_add3_insn (dest, tmp, tp));
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+  return dest;
+}
+
+rtx
+loongarch_legitimize_call_address (rtx addr)
+{
+  if (!call_insn_operand (addr, VOIDmode))
+    {
+      rtx reg = gen_reg_rtx (Pmode);
+      loongarch_emit_move (reg, addr);
+      return reg;
+    }
+  return addr;
+}
+
+/* If X is a PLUS of a CONST_INT, return the two terms in *BASE_PTR
+   and *OFFSET_PTR.  Return X in *BASE_PTR and 0 in *OFFSET_PTR otherwise.  */
+
+static void
+loongarch_split_plus (rtx x, rtx *base_ptr, HOST_WIDE_INT *offset_ptr)
+{
+  if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1)))
+    {
+      *base_ptr = XEXP (x, 0);
+      *offset_ptr = INTVAL (XEXP (x, 1));
+    }
+  else
+    {
+      *base_ptr = x;
+      *offset_ptr = 0;
+    }
+}
+
+/* If X is not a valid address for mode MODE, force it into a register.  */
+
+static rtx
+loongarch_force_address (rtx x, machine_mode mode)
+{
+  if (!loongarch_legitimate_address_p (mode, x, false))
+    x = force_reg (Pmode, x);
+  return x;
+}
+
+/* This function is used to implement LEGITIMIZE_ADDRESS.  If X can
+   be legitimized in a way that the generic machinery might not expect,
+   return a new address, otherwise return NULL.  MODE is the mode of
+   the memory being accessed.  */
+
+static rtx
+loongarch_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
+			      machine_mode mode)
+{
+  rtx base, addr;
+  HOST_WIDE_INT offset;
+
+  if (loongarch_tls_symbol_p (x))
+    return loongarch_legitimize_tls_address (x);
+
+  /* Handle BASE + OFFSET using loongarch_add_offset.  */
+  loongarch_split_plus (x, &base, &offset);
+  if (offset != 0)
+    {
+      if (!loongarch_valid_base_register_p (base, mode, false))
+	base = copy_to_mode_reg (Pmode, base);
+      addr = loongarch_add_offset (NULL, base, offset);
+      return loongarch_force_address (addr, mode);
+    }
+
+  return x;
+}
+
+/* Load VALUE into DEST.  TEMP is as for loongarch_force_temporary.  */
+
+void
+loongarch_move_integer (rtx temp, rtx dest, unsigned HOST_WIDE_INT value)
+{
+  struct loongarch_integer_op codes[LARCH_MAX_INTEGER_OPS];
+  machine_mode mode;
+  unsigned int i, num_ops;
+  rtx x;
+
+  mode = GET_MODE (dest);
+  num_ops = loongarch_build_integer (codes, value);
+
+  /* Apply each binary operation to X.  Invariant: X is a legitimate
+     source operand for a SET pattern.  */
+  x = GEN_INT (codes[0].value);
+  for (i = 1; i < num_ops; i++)
+    {
+      if (!can_create_pseudo_p ())
+	{
+	  emit_insn (gen_rtx_SET (temp, x));
+	  x = temp;
+	}
+      else
+	x = force_reg (mode, x);
+
+      switch (codes[i].method)
+	{
+	case METHOD_NORMAL:
+	  x = gen_rtx_fmt_ee (codes[i].code, mode, x,
+			      GEN_INT (codes[i].value));
+	  break;
+	case METHOD_LU32I:
+	  emit_insn (
+	    gen_rtx_SET (x,
+			 gen_rtx_IOR (DImode,
+				      gen_rtx_ZERO_EXTEND (
+					DImode, gen_rtx_SUBREG (SImode, x, 0)),
+				      GEN_INT (codes[i].value))));
+	  break;
+	case METHOD_LU52I:
+	  emit_insn (gen_lu52i_d (x, x, GEN_INT (0xfffffffffffff),
+				  GEN_INT (codes[i].value)));
+	  break;
+	case METHOD_INSV:
+	  emit_insn (
+	    gen_rtx_SET (gen_rtx_ZERO_EXTRACT (DImode, x, GEN_INT (20),
+					       GEN_INT (32)),
+			 gen_rtx_REG (DImode, 0)));
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+    }
+
+  emit_insn (gen_rtx_SET (dest, x));
+}
+
+/* Subroutine of loongarch_legitimize_move.  Move constant SRC into register
+   DEST given that SRC satisfies immediate_operand but doesn't satisfy
+   move_operand.  */
+
+static void
+loongarch_legitimize_const_move (machine_mode mode, rtx dest, rtx src)
+{
+  rtx base, offset;
+
+  /* Split moves of big integers into smaller pieces.  */
+  if (splittable_const_int_operand (src, mode))
+    {
+      loongarch_move_integer (dest, dest, INTVAL (src));
+      return;
+    }
+
+  /* Generate the appropriate access sequences for TLS symbols.  */
+  if (loongarch_tls_symbol_p (src))
+    {
+      loongarch_emit_move (dest, loongarch_legitimize_tls_address (src));
+      return;
+    }
+
+  /* If we have (const (plus symbol offset)) and that expression cannot
+     be forced into memory, load the symbol first and add in the offset.
+     We prefer to do this even if the constant _can_ be forced into
+     memory, as it usually produces better code.  */
+  split_const (src, &base, &offset);
+  if (offset != const0_rtx
+      && (targetm.cannot_force_const_mem (mode, src)
+	  || (can_create_pseudo_p ())))
+    {
+      base = loongarch_force_temporary (dest, base);
+      loongarch_emit_move (dest,
+			   loongarch_add_offset (NULL, base, INTVAL (offset)));
+      return;
+    }
+
+  src = force_const_mem (mode, src);
+
+  loongarch_emit_move (dest, src);
+}
+
+/* If (set DEST SRC) is not a valid move instruction, emit an equivalent
+   sequence that is valid.  */
+
+bool
+loongarch_legitimize_move (machine_mode mode, rtx dest, rtx src)
+{
+  if (!register_operand (dest, mode) && !reg_or_0_operand (src, mode))
+    {
+      loongarch_emit_move (dest, force_reg (mode, src));
+      return true;
+    }
+
+  /* Both SRC and DEST are non-registers; the one special case supported
+     directly is storing (const_int 0), which can use the zero register
+     as the source.  */
+  if (!register_operand (dest, mode) && !register_operand (src, mode)
+      && !const_0_operand (src, mode))
+    {
+      loongarch_emit_move (dest, force_reg (mode, src));
+      return true;
+    }
+
+  /* We need to deal with constants that would be legitimate
+     immediate_operands but aren't legitimate move_operands.  */
+  if (CONSTANT_P (src) && !move_operand (src, mode))
+    {
+      loongarch_legitimize_const_move (mode, dest, src);
+      set_unique_reg_note (get_last_insn (), REG_EQUAL, copy_rtx (src));
+      return true;
+    }
+
+  return false;
+}
+
+/* Return true if X refers to small data symbols directly, not through
+   a LO_SUM.  CONTEXT is the context in which X appears.  */
+
+static int
+loongarch_small_data_pattern_1 (rtx x,
+				enum loongarch_symbol_context context
+				ATTRIBUTE_UNUSED)
+{
+  subrtx_var_iterator::array_type array;
+  FOR_EACH_SUBRTX_VAR (iter, array, x, ALL)
+    {
+      rtx x = *iter;
+
+      /* We make no particular guarantee about which symbolic constants are
+	 acceptable as asm operands versus which must be forced into a GPR.  */
+      if (GET_CODE (x) == ASM_OPERANDS)
+	iter.skip_subrtxes ();
+      else if (MEM_P (x))
+	{
+	  if (loongarch_small_data_pattern_1 (XEXP (x, 0), SYMBOL_CONTEXT_MEM))
+	    return true;
+	  iter.skip_subrtxes ();
+	}
+    }
+  return false;
+}
+
+/* Return true if OP refers to small data symbols directly, not through
+   a LO_SUM.  */
+
+bool
+loongarch_small_data_pattern_p (rtx op)
+{
+  return loongarch_small_data_pattern_1 (op, SYMBOL_CONTEXT_LEA);
+}
+
+/* Rewrite *LOC so that it refers to small data using explicit
+   relocations.  CONTEXT is the context in which *LOC appears.  */
+
+static void
+loongarch_rewrite_small_data_1 (rtx *loc,
+				enum loongarch_symbol_context context
+				ATTRIBUTE_UNUSED)
+{
+  subrtx_ptr_iterator::array_type array;
+  FOR_EACH_SUBRTX_PTR (iter, array, loc, ALL)
+    {
+      rtx *loc = *iter;
+      if (MEM_P (*loc))
+	{
+	  loongarch_rewrite_small_data_1 (&XEXP (*loc, 0), SYMBOL_CONTEXT_MEM);
+	  iter.skip_subrtxes ();
+	}
+    }
+}
+
+/* Rewrite instruction pattern PATTERN so that it refers to small data
+   using explicit relocations.  */
+
+rtx
+loongarch_rewrite_small_data (rtx pattern)
+{
+  pattern = copy_insn (pattern);
+  loongarch_rewrite_small_data_1 (&pattern, SYMBOL_CONTEXT_LEA);
+  return pattern;
+}
+
+/* The cost of loading values from the constant pool.  It should be
+   larger than the cost of any constant we want to synthesize inline.  */
+#define CONSTANT_POOL_COST COSTS_N_INSNS (8)
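+/* COSTS_N_INSNS (8) comfortably exceeds the four-operation worst case
+   of loongarch_build_integer above, so synthesizable constants are
+   always preferred over pool loads.  */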
+
+/* Return true if there is an instruction that implements CODE
+   and if that instruction accepts X as an immediate operand.  */
+
+static int
+loongarch_immediate_operand_p (int code, HOST_WIDE_INT x)
+{
+  switch (code)
+    {
+    case ASHIFT:
+    case ASHIFTRT:
+    case LSHIFTRT:
+      /* All shift counts are truncated to a valid constant.  */
+      return true;
+
+    case ROTATE:
+    case ROTATERT:
+      /* Likewise rotates, if the target supports rotates at all.  */
+      return true;
+
+    case AND:
+    case IOR:
+    case XOR:
+      /* These instructions take 12-bit unsigned immediates.  */
+      return IMM12_OPERAND_UNSIGNED (x);
+
+    case PLUS:
+    case LT:
+    case LTU:
+      /* These instructions take 12-bit signed immediates.  */
+      return IMM12_OPERAND (x);
+
+    case EQ:
+    case NE:
+    case GT:
+    case GTU:
+      /* The "immediate" forms of these instructions are really
+	 implemented as comparisons with register 0.  */
+      return x == 0;
+
+    case GE:
+    case GEU:
+      /* Likewise, meaning that the only valid immediate operand is 1.  */
+      return x == 1;
+
+    case LE:
+      /* We add 1 to the immediate and use SLT.  */
+      return IMM12_OPERAND (x + 1);
+
+    case LEU:
+      /* Likewise SLTU, but reject the always-true case.  */
+      return IMM12_OPERAND (x + 1) && x + 1 != 0;
+
+    case SIGN_EXTRACT:
+    case ZERO_EXTRACT:
+      /* The bit position and size are immediate operands.  */
+      return 1;
+
+    default:
+      /* By default assume that $0 can be used for 0.  */
+      return x == 0;
+    }
+}
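+
+/* For example, LE is implemented by comparing against X + 1 with SLT,
+   so X == -2049 is accepted (since -2048 is a valid 12-bit immediate)
+   even though LT would reject it; LEU additionally rejects X + 1 == 0,
+   as an always-true unsigned comparison should have been folded away.  */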
+
+/* Return the cost of binary operation X, given that the instruction
+   sequence for a word-sized or smaller operation has cost SINGLE_COST
+   and that the sequence of a double-word operation has cost DOUBLE_COST.
+   If SPEED is true, optimize for speed otherwise optimize for size.  */
+
+static int
+loongarch_binary_cost (rtx x, int single_cost, int double_cost, bool speed)
+{
+  int cost;
+
+  if (GET_MODE_SIZE (GET_MODE (x)) == UNITS_PER_WORD * 2)
+    cost = double_cost;
+  else
+    cost = single_cost;
+  return (cost
+	  + set_src_cost (XEXP (x, 0), GET_MODE (x), speed)
+	  + rtx_cost (XEXP (x, 1), GET_MODE (x), GET_CODE (x), 1, speed));
+}
+
+/* Return the cost of floating-point multiplications of mode MODE.  */
+
+static int
+loongarch_fp_mult_cost (machine_mode mode)
+{
+  return mode == DFmode ? loongarch_cost->fp_mult_df
+			: loongarch_cost->fp_mult_sf;
+}
+
+/* Return the cost of floating-point divisions of mode MODE.  */
+
+static int
+loongarch_fp_div_cost (machine_mode mode)
+{
+  return mode == DFmode ? loongarch_cost->fp_div_df
+			: loongarch_cost->fp_div_sf;
+}
+
+/* Return the cost of sign-extending OP to mode MODE, not including the
+   cost of OP itself.  */
+
+static int
+loongarch_sign_extend_cost (machine_mode mode, rtx op)
+{
+  if (MEM_P (op))
+    /* Extended loads are as cheap as unextended ones.  */
+    return 0;
+
+  if (TARGET_64BIT && mode == DImode && GET_MODE (op) == SImode)
+    /* A sign extension from SImode to DImode in 64-bit mode is free.  */
+    return 0;
+
+  return COSTS_N_INSNS (1);
+}
+
+/* Return the cost of zero-extending OP to mode MODE, not including the
+   cost of OP itself.  */
+
+static int
+loongarch_zero_extend_cost (machine_mode mode, rtx op)
+{
+  if (MEM_P (op))
+    /* Extended loads are as cheap as unextended ones.  */
+    return 0;
+
+  if (TARGET_64BIT && mode == DImode && GET_MODE (op) == SImode)
+    /* We need a shift left by 32 bits and a shift right by 32 bits.  */
+    return COSTS_N_INSNS (2);
+
+  /* We can use ANDI.  */
+  return COSTS_N_INSNS (1);
+}
+
+/* Return the cost of moving between two registers of mode MODE,
+   assuming that the move will be in pieces of at most UNITS bytes.  */
+
+static int
+loongarch_set_reg_reg_piece_cost (machine_mode mode, unsigned int units)
+{
+  return COSTS_N_INSNS ((GET_MODE_SIZE (mode) + units - 1) / units);
+}
+
+/* Return the cost of moving between two registers of mode MODE.  */
+
+static int
+loongarch_set_reg_reg_cost (machine_mode mode)
+{
+  switch (GET_MODE_CLASS (mode))
+    {
+    case MODE_CC:
+      return loongarch_set_reg_reg_piece_cost (mode, GET_MODE_SIZE (CCmode));
+
+    case MODE_FLOAT:
+    case MODE_COMPLEX_FLOAT:
+    case MODE_VECTOR_FLOAT:
+      if (TARGET_HARD_FLOAT)
+	return loongarch_set_reg_reg_piece_cost (mode, UNITS_PER_HWFPVALUE);
+      /* Fall through.  */
+
+    default:
+      return loongarch_set_reg_reg_piece_cost (mode, UNITS_PER_WORD);
+    }
+}
+
+/* Implement TARGET_RTX_COSTS.  */
+
+static bool
+loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
+		     int opno ATTRIBUTE_UNUSED, int *total, bool speed)
+{
+  int code = GET_CODE (x);
+  bool float_mode_p = FLOAT_MODE_P (mode);
+  int cost;
+  rtx addr;
+
+  if (outer_code == COMPARE)
+    {
+      gcc_assert (CONSTANT_P (x));
+      *total = 0;
+      return true;
+    }
+
+  switch (code)
+    {
+    case CONST_INT:
+      if (TARGET_64BIT && outer_code == AND && UINTVAL (x) == 0xffffffff)
+	{
+	  *total = 0;
+	  return true;
+	}
+
+      /* When not optimizing for size, we care more about the cost
+	 of hot code, and hot code is often in a loop.  If a constant
+	 operand needs to be forced into a register, we will often be
+	 able to hoist the constant load out of the loop, so the load
+	 should not contribute to the cost.  */
+      if (speed || loongarch_immediate_operand_p (outer_code, INTVAL (x)))
+	{
+	  *total = 0;
+	  return true;
+	}
+      /* Fall through.  */
+
+    case CONST:
+    case SYMBOL_REF:
+    case LABEL_REF:
+    case CONST_DOUBLE:
+      cost = loongarch_const_insns (x);
+      if (cost > 0)
+	{
+	  if (cost == 1 && outer_code == SET
+	      && !(float_mode_p && TARGET_HARD_FLOAT))
+	    cost = 0;
+	  else if ((outer_code == SET || GET_MODE (x) == VOIDmode))
+	    cost = 1;
+	  *total = COSTS_N_INSNS (cost);
+	  return true;
+	}
+      /* The value will need to be fetched from the constant pool.  */
+      *total = CONSTANT_POOL_COST;
+      return true;
+
+    case MEM:
+      /* If the address is legitimate, return the number of
+	 instructions it needs.  */
+      addr = XEXP (x, 0);
+      /* Check for a scaled indexed address.  */
+      if (loongarch_index_address_p (addr, mode))
+	{
+	  *total = COSTS_N_INSNS (2);
+	  return true;
+	}
+      cost = loongarch_address_insns (addr, mode, true);
+      if (cost > 0)
+	{
+	  *total = COSTS_N_INSNS (cost + 1);
+	  return true;
+	}
+      /* Otherwise use the default handling.  */
+      return false;
+
+    case FFS:
+      *total = COSTS_N_INSNS (6);
+      return false;
+
+    case NOT:
+      *total = COSTS_N_INSNS (GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1);
+      return false;
+
+    case AND:
+      /* Check for a *clear_upper32 pattern and treat it like a zero
+	 extension.  See the pattern's comment for details.  */
+      if (TARGET_64BIT && mode == DImode && CONST_INT_P (XEXP (x, 1))
+	  && UINTVAL (XEXP (x, 1)) == 0xffffffff)
+	{
+	  *total = (loongarch_zero_extend_cost (mode, XEXP (x, 0))
+		    + set_src_cost (XEXP (x, 0), mode, speed));
+	  return true;
+	}
+      /* (AND (NOT op0) (NOT op1)) is a NOR operation that can be done in
+	 a single instruction.  */
+      if (GET_CODE (XEXP (x, 0)) == NOT && GET_CODE (XEXP (x, 1)) == NOT)
+	{
+	  cost = GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1;
+	  *total = (COSTS_N_INSNS (cost)
+		    + set_src_cost (XEXP (XEXP (x, 0), 0), mode, speed)
+		    + set_src_cost (XEXP (XEXP (x, 1), 0), mode, speed));
+	  return true;
+	}
+
+      /* Fall through.  */
+
+    case IOR:
+    case XOR:
+      /* Double-word operations use two single-word operations.  */
+      *total = loongarch_binary_cost (x, COSTS_N_INSNS (1), COSTS_N_INSNS (2),
+				      speed);
+      return true;
+
+    case ASHIFT:
+    case ASHIFTRT:
+    case LSHIFTRT:
+    case ROTATE:
+    case ROTATERT:
+      if (CONSTANT_P (XEXP (x, 1)))
+	*total = loongarch_binary_cost (x, COSTS_N_INSNS (1),
+					COSTS_N_INSNS (4), speed);
+      else
+	*total = loongarch_binary_cost (x, COSTS_N_INSNS (1),
+					COSTS_N_INSNS (12), speed);
+      return true;
+
+    case ABS:
+      if (float_mode_p)
+	*total = loongarch_cost->fp_add;
+      else
+	*total = COSTS_N_INSNS (4);
+      return false;
+
+    case LT:
+    case LTU:
+    case LE:
+    case LEU:
+    case GT:
+    case GTU:
+    case GE:
+    case GEU:
+    case EQ:
+    case NE:
+    case UNORDERED:
+    case LTGT:
+    case UNGE:
+    case UNGT:
+    case UNLE:
+    case UNLT:
+      /* Branch comparisons have VOIDmode, so use the first operand's
+	 mode instead.  */
+      mode = GET_MODE (XEXP (x, 0));
+      if (FLOAT_MODE_P (mode))
+	{
+	  *total = loongarch_cost->fp_add;
+	  return false;
+	}
+      *total = loongarch_binary_cost (x, COSTS_N_INSNS (1), COSTS_N_INSNS (4),
+				      speed);
+      return true;
+
+    case MINUS:
+    case PLUS:
+      if (float_mode_p)
+	{
+	  *total = loongarch_cost->fp_add;
+	  return false;
+	}
+
+      /* If this is an add + mult (equivalent to a shift left) and its
+	 immediate operand satisfies the const_immalsl_operand predicate,
+	 it can be done with a single shifted-add instruction.  */
+      if ((mode == SImode || (TARGET_64BIT && mode == DImode))
+	  && GET_CODE (XEXP (x, 0)) == MULT)
+	{
+	  rtx op2 = XEXP (XEXP (x, 0), 1);
+	  if (const_immalsl_operand (op2, mode))
+	    {
+	      *total = (COSTS_N_INSNS (1)
+			+ set_src_cost (XEXP (XEXP (x, 0), 0), mode, speed)
+			+ set_src_cost (XEXP (x, 1), mode, speed));
+	      return true;
+	    }
+	}
+
+      /* Double-word operations require three single-word operations and
+	 an SLTU.  */
+      *total = loongarch_binary_cost (x, COSTS_N_INSNS (1), COSTS_N_INSNS (4),
+				      speed);
+      return true;
+
+    case NEG:
+      if (float_mode_p)
+	*total = loongarch_cost->fp_add;
+      else
+	*total = COSTS_N_INSNS (GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 4 : 1);
+      return false;
+
+    case FMA:
+      *total = loongarch_fp_mult_cost (mode);
+      return false;
+
+    case MULT:
+      if (float_mode_p)
+	*total = loongarch_fp_mult_cost (mode);
+      else if (mode == DImode && !TARGET_64BIT)
+	*total = (speed
+		  ? loongarch_cost->int_mult_si * 3 + 6
+		  : COSTS_N_INSNS (7));
+      else if (!speed)
+	*total = COSTS_N_INSNS (1) + 1;
+      else if (mode == DImode)
+	*total = loongarch_cost->int_mult_di;
+      else
+	*total = loongarch_cost->int_mult_si;
+      return false;
+
+    case DIV:
+      /* Check for a reciprocal.  */
+      if (float_mode_p
+	  && flag_unsafe_math_optimizations
+	  && XEXP (x, 0) == CONST1_RTX (mode))
+	{
+	  if (outer_code == SQRT || GET_CODE (XEXP (x, 1)) == SQRT)
+	    /* An rsqrt<mode>a or rsqrt<mode>b pattern.  Count the
+	       division as being free.  */
+	    *total = set_src_cost (XEXP (x, 1), mode, speed);
+	  else
+	    *total = (loongarch_fp_div_cost (mode)
+		      + set_src_cost (XEXP (x, 1), mode, speed));
+	  return true;
+	}
+      /* Fall through.  */
+
+    case SQRT:
+    case MOD:
+      if (float_mode_p)
+	{
+	  *total = loongarch_fp_div_cost (mode);
+	  return false;
+	}
+      /* Fall through.  */
+
+    case UDIV:
+    case UMOD:
+      if (!speed)
+	*total = COSTS_N_INSNS (loongarch_idiv_insns (mode));
+      else if (mode == DImode)
+	*total = loongarch_cost->int_div_di;
+      else
+	*total = loongarch_cost->int_div_si;
+      return false;
+
+    case SIGN_EXTEND:
+      *total = loongarch_sign_extend_cost (mode, XEXP (x, 0));
+      return false;
+
+    case ZERO_EXTEND:
+      *total = loongarch_zero_extend_cost (mode, XEXP (x, 0));
+      return false;
+    case TRUNCATE:
+      /* Costings for highpart multiplies.  Matching patterns of the form:
+
+	   (lshiftrt:DI (mult:DI (sign_extend:DI (...))
+				 (sign_extend:DI (...)))
+			(const_int 32))  */
+      if ((GET_CODE (XEXP (x, 0)) == ASHIFTRT
+	   || GET_CODE (XEXP (x, 0)) == LSHIFTRT)
+	  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+	  && ((INTVAL (XEXP (XEXP (x, 0), 1)) == 32
+	       && GET_MODE (XEXP (x, 0)) == DImode)
+	      || (TARGET_64BIT
+		  && INTVAL (XEXP (XEXP (x, 0), 1)) == 64
+		  && GET_MODE (XEXP (x, 0)) == TImode))
+	  && GET_CODE (XEXP (XEXP (x, 0), 0)) == MULT
+	  && ((GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == SIGN_EXTEND
+	       && GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 1)) == SIGN_EXTEND)
+	      || (GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) == ZERO_EXTEND
+		  && (GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 1))
+		      == ZERO_EXTEND))))
+	{
+	  if (!speed)
+	    *total = COSTS_N_INSNS (1) + 1;
+	  else if (mode == DImode)
+	    *total = loongarch_cost->int_mult_di;
+	  else
+	    *total = loongarch_cost->int_mult_si;
+
+	  /* Sign extension is free; zero extension has a cost for DImode
+	     operands on a 64-bit core.  */
+	  for (int i = 0; i < 2; ++i)
+	    {
+	      rtx op = XEXP (XEXP (XEXP (x, 0), 0), i);
+	      if (TARGET_64BIT
+		  && GET_CODE (op) == ZERO_EXTEND
+		  && GET_MODE (op) == DImode)
+		*total += rtx_cost (op, DImode, MULT, i, speed);
+	      else
+		*total += rtx_cost (XEXP (op, 0), VOIDmode, GET_CODE (op), 0,
+				    speed);
+	    }
+
+	  return true;
+	}
+      return false;
+
+    case FLOAT:
+    case UNSIGNED_FLOAT:
+    case FIX:
+    case FLOAT_EXTEND:
+    case FLOAT_TRUNCATE:
+      *total = loongarch_cost->fp_add;
+      return false;
+
+    case SET:
+      if (register_operand (SET_DEST (x), VOIDmode)
+	  && reg_or_0_operand (SET_SRC (x), VOIDmode))
+	{
+	  *total = loongarch_set_reg_reg_cost (GET_MODE (SET_DEST (x)));
+	  return true;
+	}
+      return false;
+
+    default:
+      return false;
+    }
+}
+
+/* Implement TARGET_ADDRESS_COST.  */
+
+static int
+loongarch_address_cost (rtx addr, machine_mode mode,
+			addr_space_t as ATTRIBUTE_UNUSED,
+			bool speed ATTRIBUTE_UNUSED)
+{
+  return loongarch_address_insns (addr, mode, false);
+}
+
+/* Return one word of double-word value OP, taking into account the fixed
+   endianness of certain registers.  HIGH_P is true to select the high part,
+   false to select the low part.  */
+
+rtx
+loongarch_subword (rtx op, bool high_p)
+{
+  unsigned int byte, offset;
+  machine_mode mode;
+
+  mode = GET_MODE (op);
+  if (mode == VOIDmode)
+    mode = TARGET_64BIT ? TImode : DImode;
+
+  if (high_p)
+    byte = UNITS_PER_WORD;
+  else
+    byte = 0;
+
+  if (FP_REG_RTX_P (op))
+    {
+      /* Paired FPRs are always ordered little-endian.  */
+      offset = (UNITS_PER_WORD < UNITS_PER_HWFPVALUE ? high_p : byte != 0);
+      return gen_rtx_REG (word_mode, REGNO (op) + offset);
+    }
+
+  if (MEM_P (op))
+    return loongarch_rewrite_small_data (adjust_address (op, word_mode, byte));
+
+  return simplify_gen_subreg (word_mode, op, mode, byte);
+}
+
+/* Return true if a move from SRC to DEST should be split into two.
+   SPLIT_TYPE describes the split condition.  */
+
+bool
+loongarch_split_move_p (rtx dest, rtx src,
+			enum loongarch_split_type split_type ATTRIBUTE_UNUSED)
+{
+  /* FPR-to-FPR moves can be done in a single instruction, if they're
+     allowed at all.  */
+  unsigned int size = GET_MODE_SIZE (GET_MODE (dest));
+  if (size == 8 && FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
+    return false;
+
+  /* Check for floating-point loads and stores.  */
+  if (size == 8)
+    {
+      if (FP_REG_RTX_P (dest) && MEM_P (src))
+	return false;
+      if (FP_REG_RTX_P (src) && MEM_P (dest))
+	return false;
+    }
+  /* Otherwise split all multiword moves.  */
+  return size > UNITS_PER_WORD;
+}
+
+/* Split a move from SRC to DEST, given that loongarch_split_move_p holds.
+   SPLIT_TYPE describes the split condition.  */
+
+void
+loongarch_split_move (rtx dest, rtx src, enum loongarch_split_type split_type,
+		      rtx insn_)
+{
+  rtx low_dest;
+
+  gcc_checking_assert (loongarch_split_move_p (dest, src, split_type));
+  if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src))
+    {
+      if (!TARGET_64BIT && GET_MODE (dest) == DImode)
+	emit_insn (gen_move_doubleword_fprdi (dest, src));
+      else if (!TARGET_64BIT && GET_MODE (dest) == DFmode)
+	emit_insn (gen_move_doubleword_fprdf (dest, src));
+      else if (TARGET_64BIT && GET_MODE (dest) == TFmode)
+	emit_insn (gen_move_doubleword_fprtf (dest, src));
+      else
+	gcc_unreachable ();
+    }
+  else
+    {
+      /* The operation can be split into two normal moves.  Decide in
+	 which order to do them.  */
+      low_dest = loongarch_subword (dest, false);
+      if (REG_P (low_dest) && reg_overlap_mentioned_p (low_dest, src))
+	{
+	  loongarch_emit_move (loongarch_subword (dest, true),
+			       loongarch_subword (src, true));
+	  loongarch_emit_move (low_dest, loongarch_subword (src, false));
+	}
+      else
+	{
+	  loongarch_emit_move (low_dest, loongarch_subword (src, false));
+	  loongarch_emit_move (loongarch_subword (dest, true),
+			       loongarch_subword (src, true));
+	}
+    }
+
+  /* This is a hack.  See if the next insn uses DEST and if so, see if we
+     can forward SRC for DEST.  This is most useful if the next insn is a
+     simple store.  */
+  rtx_insn *insn = (rtx_insn *) insn_;
+  struct loongarch_address_info addr = {};
+  if (insn)
+    {
+      rtx_insn *next = next_nonnote_nondebug_insn_bb (insn);
+      if (next)
+	{
+	  rtx set = single_set (next);
+	  if (set && SET_SRC (set) == dest)
+	    {
+	      if (MEM_P (src))
+		{
+		  rtx tmp = XEXP (src, 0);
+		  loongarch_classify_address (&addr, tmp, GET_MODE (tmp),
+					      true);
+		  if (addr.reg && !reg_overlap_mentioned_p (dest, addr.reg))
+		    validate_change (next, &SET_SRC (set), src, false);
+		}
+	      else
+		validate_change (next, &SET_SRC (set), src, false);
+	    }
+	}
+    }
+}
+
+/* Return the split type for instruction INSN.  */
+
+static enum loongarch_split_type
+loongarch_insn_split_type (rtx insn)
+{
+  basic_block bb = BLOCK_FOR_INSN (insn);
+  if (bb)
+    {
+      if (optimize_bb_for_speed_p (bb))
+	return SPLIT_FOR_SPEED;
+      else
+	return SPLIT_FOR_SIZE;
+    }
+  /* Once CFG information has been removed, we should trust the optimization
+     decisions made by previous passes and only split where necessary.  */
+  return SPLIT_IF_NECESSARY;
+}
+
+/* Return true if a move from SRC to DEST in INSN should be split.  */
+
+bool
+loongarch_split_move_insn_p (rtx dest, rtx src, rtx insn)
+{
+  return loongarch_split_move_p (dest, src, loongarch_insn_split_type (insn));
+}
+
+/* Split a move from SRC to DEST in INSN, given that
+   loongarch_split_move_insn_p holds.  */
+
+void
+loongarch_split_move_insn (rtx dest, rtx src, rtx insn)
+{
+  loongarch_split_move (dest, src, loongarch_insn_split_type (insn), insn);
+}
+
+/* Forward declaration.  Used below.  */
+static HOST_WIDE_INT
+loongarch_constant_alignment (const_tree exp, HOST_WIDE_INT align);
+
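+/* Return the assembly template for an indexed (register + register)
+   load (LDR true) or store (LDR false) of MODE, or NULL if the address
+   X is not a legitimate ADDRESS_REG_REG address or MODE has an
+   unsupported size.  */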
+const char *
+loongarch_output_move_index (rtx x, machine_mode mode, bool ldr)
+{
+  int index = exact_log2 (GET_MODE_SIZE (mode));
+  if (!IN_RANGE (index, 0, 3))
+    return NULL;
+
+  struct loongarch_address_info info;
+  if ((loongarch_classify_address (&info, x, mode, false)
+       && info.type != ADDRESS_REG_REG)
+      || !loongarch_legitimate_address_p (mode, x, false))
+    return NULL;
+
+  const char *const insn[][4] =
+    {
+      {
+	"stx.b\t%z1,%0",
+	"stx.h\t%z1,%0",
+	"stx.w\t%z1,%0",
+	"stx.d\t%z1,%0",
+      },
+      {
+	"ldx.bu\t%0,%1",
+	"ldx.hu\t%0,%1",
+	"ldx.w\t%0,%1",
+	"ldx.d\t%0,%1",
+      }
+    };
+
+  return insn[ldr][index];
+}
+
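+/* Like loongarch_output_move_index, but for FPR loads and stores.  */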
+const char *
+loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
+{
+  int index = exact_log2 (GET_MODE_SIZE (mode));
+  if (!IN_RANGE (index, 2, 3))
+    return NULL;
+
+  struct loongarch_address_info info;
+  if ((loongarch_classify_address (&info, x, mode, false)
+       && info.type != ADDRESS_REG_REG)
+      || !loongarch_legitimate_address_p (mode, x, false))
+    return NULL;
+
+  const char *const insn[][2] =
+    {
+      {
+	"fstx.s\t%1,%0",
+	"fstx.d\t%1,%0"
+      },
+      {
+	"fldx.s\t%0,%1",
+	"fldx.d\t%0,%1"
+      }
+    };
+
+  return insn[ldr][index - 2];
+}
+
+/* Return the appropriate instructions to move SRC into DEST.  Assume
+   that SRC is operand 1 and DEST is operand 0.  */
+
+const char *
+loongarch_output_move (rtx dest, rtx src)
+{
+  enum rtx_code dest_code = GET_CODE (dest);
+  enum rtx_code src_code = GET_CODE (src);
+  machine_mode mode = GET_MODE (dest);
+  bool dbl_p = (GET_MODE_SIZE (mode) == 8);
+
+  if (loongarch_split_move_p (dest, src, SPLIT_IF_NECESSARY))
+    return "#";
+
+  if ((src_code == REG && GP_REG_P (REGNO (src)))
+      || (src == CONST0_RTX (mode)))
+    {
+      if (dest_code == REG)
+	{
+	  if (GP_REG_P (REGNO (dest)))
+	    return "or\t%0,%z1,$r0";
+
+	  if (FP_REG_P (REGNO (dest)))
+	    return dbl_p ? "movgr2fr.d\t%0,%z1" : "movgr2fr.w\t%0,%z1";
+	}
+      if (dest_code == MEM)
+	{
+	  const char *insn = NULL;
+	  insn = loongarch_output_move_index (XEXP (dest, 0), GET_MODE (dest),
+					      false);
+	  if (insn)
+	    return insn;
+
+	  rtx offset = XEXP (dest, 0);
+	  if (GET_CODE (offset) == PLUS)
+	    offset = XEXP (offset, 1);
+	  switch (GET_MODE_SIZE (mode))
+	    {
+	    case 1:
+	      return "st.b\t%z1,%0";
+	    case 2:
+	      return "st.h\t%z1,%0";
+	    case 4:
+	      if (const_arith_operand (offset, Pmode))
+		return "st.w\t%z1,%0";
+	      else
+		return "stptr.w\t%z1,%0";
+	    case 8:
+	      if (const_arith_operand (offset, Pmode))
+		return "st.d\t%z1,%0";
+	      else
+		return "stptr.d\t%z1,%0";
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+    }
+  if (dest_code == REG && GP_REG_P (REGNO (dest)))
+    {
+      if (src_code == REG)
+	if (FP_REG_P (REGNO (src)))
+	  return dbl_p ? "movfr2gr.d\t%0,%1" : "movfr2gr.s\t%0,%1";
+
+      if (src_code == MEM)
+	{
+	  const char *insn = NULL;
+	  insn = loongarch_output_move_index (XEXP (src, 0), GET_MODE (src),
+					      true);
+	  if (insn)
+	    return insn;
+
+	  rtx offset = XEXP (src, 0);
+	  if (GET_CODE (offset) == PLUS)
+	    offset = XEXP (offset, 1);
+	  switch (GET_MODE_SIZE (mode))
+	    {
+	    case 1:
+	      return "ld.bu\t%0,%1";
+	    case 2:
+	      return "ld.hu\t%0,%1";
+	    case 4:
+	      if (const_arith_operand (offset, Pmode))
+		return "ld.w\t%0,%1";
+	      else
+		return "ldptr.w\t%0,%1";
+	    case 8:
+	      if (const_arith_operand (offset, Pmode))
+		return "ld.d\t%0,%1";
+	      else
+		return "ldptr.d\t%0,%1";
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+
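+      /* Only constants that a single instruction can load are expected
+	 here; e.g. (illustrative) 0x12345000 matches the LU12I case and
+	 0xfff the ORI case.  Wider constants must have been split before
+	 this point.  */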
+      if (src_code == CONST_INT)
+	{
+	  if (LU12I_INT (src))
+	    return "lu12i.w\t%0,%1>>12\t\t\t# %X1";
+	  else if (IMM12_INT (src))
+	    return "addi.w\t%0,$r0,%1\t\t\t# %X1";
+	  else if (IMM12_INT_UNSIGNED (src))
+	    return "ori\t%0,$r0,%1\t\t\t# %X1";
+	  else if (LU52I_INT (src))
+	    return "lu52i.d\t%0,$r0,%X1>>52\t\t\t# %1";
+	  else
+	    gcc_unreachable ();
+	}
+
+      if (symbolic_operand (src, VOIDmode))
+	{
+	  if ((TARGET_CMODEL_TINY && (!loongarch_global_symbol_p (src)
+				      || loongarch_symbol_binds_local_p (src)))
+	      || (TARGET_CMODEL_TINY_STATIC && !loongarch_weak_symbol_p (src)))
+	    {
+	      /* The symbol must be aligned to a 4-byte boundary.  */
+	      unsigned int align;
+
+	      if (GET_CODE (src) == LABEL_REF)
+		align = 128 /* Whatever.  */;
+	      else if (CONSTANT_POOL_ADDRESS_P (src))
+		align = GET_MODE_ALIGNMENT (get_pool_mode (src));
+	      else if (TREE_CONSTANT_POOL_ADDRESS_P (src))
+		{
+		  tree exp = SYMBOL_REF_DECL (src);
+		  align = TYPE_ALIGN (TREE_TYPE (exp));
+		  align = loongarch_constant_alignment (exp, align);
+		}
+	      else if (SYMBOL_REF_DECL (src))
+		align = DECL_ALIGN (SYMBOL_REF_DECL (src));
+	      else if (SYMBOL_REF_HAS_BLOCK_INFO_P (src)
+		       && SYMBOL_REF_BLOCK (src) != NULL)
+		align = SYMBOL_REF_BLOCK (src)->alignment;
+	      else
+		align = BITS_PER_UNIT;
+
+	      if (align % (4 * 8) == 0)
+		return "pcaddi\t%0,%%pcrel(%1)>>2";
+	    }
+	  if (TARGET_CMODEL_TINY
+	      || TARGET_CMODEL_TINY_STATIC
+	      || TARGET_CMODEL_NORMAL
+	      || TARGET_CMODEL_LARGE)
+	    {
+	      if (!loongarch_global_symbol_p (src)
+		  || loongarch_symbol_binds_local_p (src))
+		return "la.local\t%0,%1";
+	      else
+		return "la.global\t%0,%1";
+	    }
+	  if (TARGET_CMODEL_EXTREME)
+	    {
+	      sorry ("Normal symbol loading not implemented in extreme mode.");
+	      /* GCC complains.  */
+	      /* return ""; */
+	    }
+	}
+    }
+  if (src_code == REG && FP_REG_P (REGNO (src)))
+    {
+      if (dest_code == REG && FP_REG_P (REGNO (dest)))
+	return dbl_p ? "fmov.d\t%0,%1" : "fmov.s\t%0,%1";
+
+      if (dest_code == MEM)
+	{
+	  const char *insn = NULL;
+	  insn = loongarch_output_move_index_float (XEXP (dest, 0),
+						    GET_MODE (dest),
+						    false);
+	  if (insn)
+	    return insn;
+
+	  return dbl_p ? "fst.d\t%1,%0" : "fst.s\t%1,%0";
+	}
+    }
+  if (dest_code == REG && FP_REG_P (REGNO (dest)))
+    {
+      if (src_code == MEM)
+	{
+	  const char *insn = NULL;
+	  insn = loongarch_output_move_index_float (XEXP (src, 0),
+						    GET_MODE (src),
+						    true);
+	  if (insn)
+	    return insn;
+
+	  return dbl_p ? "fld.d\t%0,%1" : "fld.s\t%0,%1";
+	}
+    }
+  gcc_unreachable ();
+}
+
+/* Return true if CMP1 is a suitable second operand for integer ordering
+   test CODE.  */
+
+static bool
+loongarch_int_order_operand_ok_p (enum rtx_code code, rtx cmp1)
+{
+  switch (code)
+    {
+    case GT:
+    case GTU:
+      return reg_or_0_operand (cmp1, VOIDmode);
+
+    case GE:
+    case GEU:
+      return cmp1 == const1_rtx;
+
+    case LT:
+    case LTU:
+      return arith_operand (cmp1, VOIDmode);
+
+    case LE:
+      return sle_operand (cmp1, VOIDmode);
+
+    case LEU:
+      return sleu_operand (cmp1, VOIDmode);
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Return true if *CMP1 (of mode MODE) is a valid second operand for
+   integer ordering test *CODE, or if an equivalent combination can
+   be formed by adjusting *CODE and *CMP1.  When returning true, update
+   *CODE and *CMP1 with the chosen code and operand, otherwise leave
+   them alone.  */
+
+static bool
+loongarch_canonicalize_int_order_test (enum rtx_code *code, rtx *cmp1,
+				       machine_mode mode)
+{
+  HOST_WIDE_INT plus_one;
+
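+  /* For example (illustrative): if the test below fails for x <= 0x7ff
+     (0x800 does not fit the immediate form), the LE case rewrites it as
+     x < 0x800 with the constant forced into a register, while
+     x <=u 0xffffffffffffffff stays rejected because adding 1 wraps to
+     zero.  */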
+  if (loongarch_int_order_operand_ok_p (*code, *cmp1))
+    return true;
+
+  if (CONST_INT_P (*cmp1))
+    switch (*code)
+      {
+      case LE:
+	plus_one = trunc_int_for_mode (UINTVAL (*cmp1) + 1, mode);
+	if (INTVAL (*cmp1) < plus_one)
+	  {
+	    *code = LT;
+	    *cmp1 = force_reg (mode, GEN_INT (plus_one));
+	    return true;
+	  }
+	break;
+
+      case LEU:
+	plus_one = trunc_int_for_mode (UINTVAL (*cmp1) + 1, mode);
+	if (plus_one != 0)
+	  {
+	    *code = LTU;
+	    *cmp1 = force_reg (mode, GEN_INT (plus_one));
+	    return true;
+	  }
+	break;
+
+      default:
+	break;
+      }
+  return false;
+}
+
+/* Compare CMP0 and CMP1 using ordering test CODE and store the result
+   in TARGET.  CMP0 and TARGET are register_operands.  If INVERT_PTR
+   is nonnull, it's OK to set TARGET to the inverse of the result and
+   flip *INVERT_PTR instead.  */
+
+static void
+loongarch_emit_int_order_test (enum rtx_code code, bool *invert_ptr,
+			       rtx target, rtx cmp0, rtx cmp1)
+{
+  machine_mode mode;
+
+  /* First see if there is a LoongArch instruction that can do this operation.
+     If not, try doing the same for the inverse operation.  If that also
+     fails, force CMP1 into a register and try again.  */
+  mode = GET_MODE (cmp0);
+  if (loongarch_canonicalize_int_order_test (&code, &cmp1, mode))
+    loongarch_emit_binary (code, target, cmp0, cmp1);
+  else
+    {
+      enum rtx_code inv_code = reverse_condition (code);
+      if (!loongarch_canonicalize_int_order_test (&inv_code, &cmp1, mode))
+	{
+	  cmp1 = force_reg (mode, cmp1);
+	  loongarch_emit_int_order_test (code, invert_ptr, target, cmp0, cmp1);
+	}
+      else if (invert_ptr == 0)
+	{
+	  rtx inv_target;
+
+	  inv_target = loongarch_force_binary (GET_MODE (target),
+					       inv_code, cmp0, cmp1);
+	  loongarch_emit_binary (XOR, target, inv_target, const1_rtx);
+	}
+      else
+	{
+	  *invert_ptr = !*invert_ptr;
+	  loongarch_emit_binary (inv_code, target, cmp0, cmp1);
+	}
+    }
+}
+
+/* Return a register that is zero if CMP0 and CMP1 are equal.
+   The register will have the same mode as CMP0.  */
+
+static rtx
+loongarch_zero_if_equal (rtx cmp0, rtx cmp1)
+{
+  if (cmp1 == const0_rtx)
+    return cmp0;
+
+  if (uns_arith_operand (cmp1, VOIDmode))
+    return expand_binop (GET_MODE (cmp0), xor_optab, cmp0, cmp1, 0, 0,
+			 OPTAB_DIRECT);
+
+  return expand_binop (GET_MODE (cmp0), sub_optab, cmp0, cmp1, 0, 0,
+		       OPTAB_DIRECT);
+}
+
+/* Allocate a floating-point condition-code register of mode MODE.  */
+
+static rtx
+loongarch_allocate_fcc (machine_mode mode)
+{
+  unsigned int regno, count;
+
+  gcc_assert (TARGET_HARD_FLOAT);
+
+  if (mode == FCCmode)
+    count = 1;
+  else
+    gcc_unreachable ();
+
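+  /* Allocate FCC registers round-robin, first realigning NEXT_FCC to a
+     multiple of COUNT (a no-op while COUNT is always 1).  */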
+  cfun->machine->next_fcc += -cfun->machine->next_fcc & (count - 1);
+  if (cfun->machine->next_fcc > FCC_REG_LAST - FCC_REG_FIRST)
+    cfun->machine->next_fcc = 0;
+
+  regno = FCC_REG_FIRST + cfun->machine->next_fcc;
+  cfun->machine->next_fcc += count;
+  return gen_rtx_REG (mode, regno);
+}
+
+/* Sign- or zero-extend OP0 and OP1 for integer comparisons.  */
+
+static void
+loongarch_extend_comparands (rtx_code code, rtx *op0, rtx *op1)
+{
+  /* Comparisons consider all word-mode bits, so extend sub-word values.  */
+  if (GET_MODE_SIZE (word_mode) > GET_MODE_SIZE (GET_MODE (*op0)))
+    {
+      /* TODO: check whether it is more profitable to zero-extend QImode
+	 values.  */
+      if (unsigned_condition (code) == code && GET_MODE (*op0) == QImode)
+	{
+	  *op0 = gen_rtx_ZERO_EXTEND (word_mode, *op0);
+	  if (CONST_INT_P (*op1))
+	    *op1 = GEN_INT ((uint8_t) INTVAL (*op1));
+	  else
+	    *op1 = gen_rtx_ZERO_EXTEND (word_mode, *op1);
+	}
+      else
+	{
+	  *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0);
+	  if (*op1 != const0_rtx)
+	    *op1 = gen_rtx_SIGN_EXTEND (word_mode, *op1);
+	}
+    }
+}
+
+/* Convert a comparison into something that can be used in a branch.  On
+   entry, *OP0 and *OP1 are the values being compared and *CODE is the code
+   used to compare them.  Update them to describe the final comparison.  */
+
+static void
+loongarch_emit_int_compare (enum rtx_code *code, rtx *op0, rtx *op1)
+{
+  static const enum rtx_code
+  mag_comparisons[][2] = {{LEU, LTU}, {GTU, GEU}, {LE, LT}, {GT, GE}};
+
+  if (splittable_const_int_operand (*op1, VOIDmode))
+    {
+      HOST_WIDE_INT rhs = INTVAL (*op1);
+
+      if (*code == EQ || *code == NE)
+	{
+	  /* Convert e.g. OP0 == 2048 into OP0 - 2048 == 0.  */
+	  if (IMM12_OPERAND (-rhs))
+	    {
+	      *op0 = loongarch_force_binary (GET_MODE (*op0), PLUS, *op0,
+					     GEN_INT (-rhs));
+	      *op1 = const0_rtx;
+	    }
+	}
+      else
+	{
+	  /* Convert e.g. (OP0 <= 0xFFF) into (OP0 < 0x1000).  */
+	  for (size_t i = 0; i < ARRAY_SIZE (mag_comparisons); i++)
+	    {
+	      HOST_WIDE_INT new_rhs;
+	      bool increment = *code == mag_comparisons[i][0];
+	      bool decrement = *code == mag_comparisons[i][1];
+	      if (!increment && !decrement)
+		continue;
+
+	      new_rhs = rhs + (increment ? 1 : -1);
+	      if (loongarch_integer_cost (new_rhs)
+		    < loongarch_integer_cost (rhs)
+		  && (rhs < 0) == (new_rhs < 0))
+		{
+		  *op1 = GEN_INT (new_rhs);
+		  *code = mag_comparisons[i][increment];
+		}
+	      break;
+	    }
+	}
+    }
+
+  loongarch_extend_comparands (*code, op0, op1);
+
+  *op0 = force_reg (word_mode, *op0);
+  if (*op1 != const0_rtx)
+    *op1 = force_reg (word_mode, *op1);
+}
+
+/* Like loongarch_emit_int_compare, but for floating-point comparisons.  */
+
+static void
+loongarch_emit_float_compare (enum rtx_code *code, rtx *op0, rtx *op1)
+{
+  rtx cmp_op0 = *op0;
+  rtx cmp_op1 = *op1;
+
+  /* Floating-point tests use a separate FCMP.cond.fmt
+     comparison to set a register.  The branch or conditional move will
+     then compare that register against zero.
+
+     Set CMP_CODE to the code of the comparison instruction and
+     *CODE to the code that the branch or move should use.  */
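+
+  /* For example (illustrative): for a < b on DFmode operands, CMP_CODE
+     stays LT and yields something like fcmp.slt.d FCC0,a,b; the caller
+     then branches or moves on (ne FCC0 0).  */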
+  enum rtx_code cmp_code = *code;
+  /* Three FP conditions cannot be implemented by reversing the operands
+     of FCMP.cond.fmt; they instead require a reversed condition code
+     and a test for false.  */
+  *code = NE;
+  *op0 = loongarch_allocate_fcc (FCCmode);
+
+  *op1 = const0_rtx;
+  loongarch_emit_binary (cmp_code, *op0, cmp_op0, cmp_op1);
+}
+
+/* Try performing the comparison in OPERANDS[1], whose arms are OPERANDS[2]
+   and OPERANDS[3].  Store the result in OPERANDS[0].
+
+   On 64-bit targets, the mode of the comparison and target will always be
+   SImode, thus possibly narrower than that of the comparison's operands.  */
+
+void
+loongarch_expand_scc (rtx operands[])
+{
+  rtx target = operands[0];
+  enum rtx_code code = GET_CODE (operands[1]);
+  rtx op0 = operands[2];
+  rtx op1 = operands[3];
+
+  gcc_assert (GET_MODE_CLASS (GET_MODE (op0)) == MODE_INT);
+
+  if (code == EQ || code == NE)
+    {
+      rtx zie = loongarch_zero_if_equal (op0, op1);
+      loongarch_emit_binary (code, target, zie, const0_rtx);
+    }
+  else
+    loongarch_emit_int_order_test (code, 0, target, op0, op1);
+}
+
+/* Compare OPERANDS[1] with OPERANDS[2] using comparison code
+   CODE and jump to OPERANDS[3] if the condition holds.  */
+
+void
+loongarch_expand_conditional_branch (rtx *operands)
+{
+  enum rtx_code code = GET_CODE (operands[0]);
+  rtx op0 = operands[1];
+  rtx op1 = operands[2];
+  rtx condition;
+
+  if (FLOAT_MODE_P (GET_MODE (op1)))
+    loongarch_emit_float_compare (&code, &op0, &op1);
+  else
+    loongarch_emit_int_compare (&code, &op0, &op1);
+
+  condition = gen_rtx_fmt_ee (code, VOIDmode, op0, op1);
+  emit_jump_insn (gen_condjump (condition, operands[3]));
+}
+
+/* Perform the comparison in OPERANDS[1].  Move OPERANDS[2] into OPERANDS[0]
+   if the condition holds, otherwise move OPERANDS[3] into OPERANDS[0].  */
+
+void
+loongarch_expand_conditional_move (rtx *operands)
+{
+  enum rtx_code code = GET_CODE (operands[1]);
+  rtx op0 = XEXP (operands[1], 0);
+  rtx op1 = XEXP (operands[1], 1);
+
+  if (FLOAT_MODE_P (GET_MODE (op1)))
+    loongarch_emit_float_compare (&code, &op0, &op1);
+  else
+    {
+      loongarch_extend_comparands (code, &op0, &op1);
+
+      op0 = force_reg (word_mode, op0);
+
+      if (code == EQ || code == NE)
+	{
+	  op0 = loongarch_zero_if_equal (op0, op1);
+	  op1 = const0_rtx;
+	}
+      else
+	{
+	  /* The comparison needs a separate scc instruction.  Store the
+	     result of the scc in *OP0 and compare it against zero.  */
+	  bool invert = false;
+	  rtx target = gen_reg_rtx (GET_MODE (op0));
+	  loongarch_emit_int_order_test (code, &invert, target, op0, op1);
+	  code = invert ? EQ : NE;
+	  op0 = target;
+	  op1 = const0_rtx;
+	}
+    }
+
+  rtx cond = gen_rtx_fmt_ee (code, GET_MODE (op0), op0, op1);
+  /* There is no direct support for a general conditional GP move
+     between two registers using SEL, so synthesize one from a pair of
+     masked moves, as sketched below.  */
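+  /* A minimal sketch (illustrative) of the emitted sequence:
+
+       temp  = cond ? operands[2] : 0
+       temp2 = cond ? 0 : operands[3]
+       dest  = temp | temp2
+
+     which typically maps onto maskeqz/masknez plus or.  */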
+  if (INTEGRAL_MODE_P (GET_MODE (operands[2]))
+      && register_operand (operands[2], VOIDmode)
+      && register_operand (operands[3], VOIDmode))
+    {
+      machine_mode mode = GET_MODE (operands[0]);
+      rtx temp = gen_reg_rtx (mode);
+      rtx temp2 = gen_reg_rtx (mode);
+
+      emit_insn (gen_rtx_SET (temp,
+			      gen_rtx_IF_THEN_ELSE (mode, cond,
+						    operands[2], const0_rtx)));
+
+      /* Flip the test for the second operand.  */
+      cond = gen_rtx_fmt_ee ((code == EQ) ? NE : EQ, GET_MODE (op0), op0, op1);
+
+      emit_insn (gen_rtx_SET (temp2,
+			      gen_rtx_IF_THEN_ELSE (mode, cond,
+						    operands[3], const0_rtx)));
+
+      /* Merge the two results, at least one is guaranteed to be zero.  */
+      emit_insn (gen_rtx_SET (operands[0], gen_rtx_IOR (mode, temp, temp2)));
+    }
+  else
+    emit_insn (gen_rtx_SET (operands[0],
+			    gen_rtx_IF_THEN_ELSE (GET_MODE (operands[0]), cond,
+						  operands[2], operands[3])));
+}
+
+/* Implement TARGET_EXPAND_BUILTIN_VA_START.  */
+
+static void
+loongarch_va_start (tree valist, rtx nextarg)
+{
+  nextarg = plus_constant (Pmode, nextarg, -cfun->machine->varargs_size);
+  std_expand_builtin_va_start (valist, nextarg);
+}
+
+/* Start a definition of function NAME.  */
+
+static void
+loongarch_start_function_definition (const char *name)
+{
+  ASM_OUTPUT_TYPE_DIRECTIVE (asm_out_file, name, "function");
+
+  /* Start the definition proper.  */
+  assemble_name (asm_out_file, name);
+  fputs (":\n", asm_out_file);
+}
+
+/* Implement TARGET_FUNCTION_OK_FOR_SIBCALL.  */
+
+static bool
+loongarch_function_ok_for_sibcall (tree decl ATTRIBUTE_UNUSED,
+				   tree exp ATTRIBUTE_UNUSED)
+{
+  /* Always OK.  */
+  return true;
+}
+
+/* Emit straight-line code to move LENGTH bytes from SRC to DEST.
+   Assume that the areas do not overlap.  */
+
+static void
+loongarch_block_move_straight (rtx dest, rtx src, HOST_WIDE_INT length)
+{
+  HOST_WIDE_INT offset, delta;
+  unsigned HOST_WIDE_INT bits;
+  int i;
+  machine_mode mode;
+  rtx *regs;
+
+  bits = MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest)));
+
+  mode = int_mode_for_size (bits, 0).require ();
+  delta = bits / BITS_PER_UNIT;
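+  /* For example (illustrative): with both operands 8-byte aligned on a
+     64-bit target, BITS is 64 and the copy proceeds in DImode chunks.  */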
+
+  /* Allocate a buffer for the temporary registers.  */
+  regs = XALLOCAVEC (rtx, length / delta);
+
+  /* Load as many BITS-sized chunks as possible, using the common
+     alignment of SRC and DEST computed above to pick the chunk size.  */
+  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+    {
+      regs[i] = gen_reg_rtx (mode);
+      loongarch_emit_move (regs[i], adjust_address (src, mode, offset));
+    }
+
+  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+    loongarch_emit_move (adjust_address (dest, mode, offset), regs[i]);
+
+  /* Mop up any left-over bytes.  */
+  if (offset < length)
+    {
+      src = adjust_address (src, BLKmode, offset);
+      dest = adjust_address (dest, BLKmode, offset);
+      move_by_pieces (dest, src, length - offset,
+		      MIN (MEM_ALIGN (src), MEM_ALIGN (dest)),
+		      (enum memop_ret) 0);
+    }
+}
+
+/* Helper function for doing a loop-based block operation on memory
+   reference MEM.  Each iteration of the loop will operate on LENGTH
+   bytes of MEM.
+
+   Create a new base register for use within the loop and point it to
+   the start of MEM.  Create a new memory reference that uses this
+   register.  Store them in *LOOP_REG and *LOOP_MEM respectively.  */
+
+static void
+loongarch_adjust_block_mem (rtx mem, HOST_WIDE_INT length, rtx *loop_reg,
+			    rtx *loop_mem)
+{
+  *loop_reg = copy_addr_to_reg (XEXP (mem, 0));
+
+  /* Although the new mem does not refer to a known location,
+     it does keep up to LENGTH bytes of alignment.  */
+  *loop_mem = change_address (mem, BLKmode, *loop_reg);
+  set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT));
+}
+
+/* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER
+   bytes at a time.  LENGTH must be at least BYTES_PER_ITER.  Assume that
+   the memory regions do not overlap.  */
+
+static void
+loongarch_block_move_loop (rtx dest, rtx src, HOST_WIDE_INT length,
+			   HOST_WIDE_INT bytes_per_iter)
+{
+  rtx_code_label *label;
+  rtx src_reg, dest_reg, final_src, test;
+  HOST_WIDE_INT leftover;
+
+  leftover = length % bytes_per_iter;
+  length -= leftover;
+
+  /* Create registers and memory references for use within the loop.  */
+  loongarch_adjust_block_mem (src, bytes_per_iter, &src_reg, &src);
+  loongarch_adjust_block_mem (dest, bytes_per_iter, &dest_reg, &dest);
+
+  /* Calculate the value that SRC_REG should have after the last iteration
+     of the loop.  */
+  final_src = expand_simple_binop (Pmode, PLUS, src_reg, GEN_INT (length), 0,
+				   0, OPTAB_WIDEN);
+
+  /* Emit the start of the loop.  */
+  label = gen_label_rtx ();
+  emit_label (label);
+
+  /* Emit the loop body.  */
+  loongarch_block_move_straight (dest, src, bytes_per_iter);
+
+  /* Move on to the next block.  */
+  loongarch_emit_move (src_reg,
+		       plus_constant (Pmode, src_reg, bytes_per_iter));
+  loongarch_emit_move (dest_reg,
+		       plus_constant (Pmode, dest_reg, bytes_per_iter));
+
+  /* Emit the loop condition.  */
+  test = gen_rtx_NE (VOIDmode, src_reg, final_src);
+  if (Pmode == DImode)
+    emit_jump_insn (gen_cbranchdi4 (test, src_reg, final_src, label));
+  else
+    emit_jump_insn (gen_cbranchsi4 (test, src_reg, final_src, label));
+
+  /* Mop up any left-over bytes.  */
+  if (leftover)
+    loongarch_block_move_straight (dest, src, leftover);
+  else
+    /* Temporary fix for PR79150.  */
+    emit_insn (gen_nop ());
+}
+
+/* Expand a cpymemsi instruction, which copies LENGTH bytes from
+   memory reference SRC to memory reference DEST.  */
+
+bool
+loongarch_expand_block_move (rtx dest, rtx src, rtx length)
+{
+  int max_move_bytes = LARCH_MAX_MOVE_BYTES_STRAIGHT;
+
+  if (CONST_INT_P (length)
+      && INTVAL (length) <= loongarch_max_inline_memcpy_size)
+    {
+      if (INTVAL (length) <= max_move_bytes)
+	{
+	  loongarch_block_move_straight (dest, src, INTVAL (length));
+	  return true;
+	}
+      else if (optimize)
+	{
+	  loongarch_block_move_loop (dest, src, INTVAL (length),
+				     LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER);
+	  return true;
+	}
+    }
+  return false;
+}
+
+/* Expand a QI or HI mode atomic memory operation.
+
+   GENERATOR contains a pointer to the gen_* function that generates
+   the SI mode underlying atomic operation using masks that we
+   calculate.
+
+   RESULT is the return register for the operation.  Its value is NULL
+   if unused.
+
+   MEM is the location of the atomic access.
+
+   OLDVAL is the first operand for the operation.
+
+   NEWVAL is the optional second operand for the operation.  Its value
+   is NULL if unused.  */
+
+void
+loongarch_expand_atomic_qihi (union loongarch_gen_fn_ptrs generator,
+			      rtx result, rtx mem, rtx oldval, rtx newval,
+			      rtx model)
+{
+  rtx orig_addr, memsi_addr, memsi, shift, shiftsi, unshifted_mask;
+  rtx unshifted_mask_reg, mask, inverted_mask, si_op;
+  rtx res = NULL;
+  machine_mode mode;
+
+  mode = GET_MODE (mem);
+
+  /* Compute the address of the containing SImode value.  */
+  orig_addr = force_reg (Pmode, XEXP (mem, 0));
+  memsi_addr = loongarch_force_binary (Pmode, AND, orig_addr,
+				       force_reg (Pmode, GEN_INT (-4)));
+
+  /* Create a memory reference for it.  */
+  memsi = gen_rtx_MEM (SImode, memsi_addr);
+  set_mem_alias_set (memsi, ALIAS_SET_MEMORY_BARRIER);
+  MEM_VOLATILE_P (memsi) = MEM_VOLATILE_P (mem);
+
+  /* Work out the byte offset of the QImode or HImode value,
+     counting from the least significant byte.  */
+  shift = loongarch_force_binary (Pmode, AND, orig_addr, GEN_INT (3));
+  /* Multiply by eight to convert the shift value from bytes to bits.  */
+  loongarch_emit_binary (ASHIFT, shift, shift, GEN_INT (3));
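+
+  /* Illustrative example: for a QImode access at ORIG_ADDR = 0x1003,
+     MEMSI_ADDR is 0x1000 and SHIFT ends up as (0x1003 & 3) * 8 = 24,
+     so the byte occupies bits 24..31 of the containing little-endian
+     SImode word.  */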
+
+  /* Make the final shift an SImode value, so that it can be used in
+     SImode operations.  */
+  shiftsi = force_reg (SImode, gen_lowpart (SImode, shift));
+
+  /* Set MASK to an inclusive mask of the QImode or HImode value.  */
+  unshifted_mask = GEN_INT (GET_MODE_MASK (mode));
+  unshifted_mask_reg = force_reg (SImode, unshifted_mask);
+  mask = loongarch_force_binary (SImode, ASHIFT, unshifted_mask_reg, shiftsi);
+
+  /* Compute the equivalent exclusive mask.  */
+  inverted_mask = gen_reg_rtx (SImode);
+  emit_insn (gen_rtx_SET (inverted_mask, gen_rtx_NOT (SImode, mask)));
+
+  /* Shift the old value into place.  */
+  if (oldval != const0_rtx)
+    {
+      oldval = convert_modes (SImode, mode, oldval, true);
+      oldval = force_reg (SImode, oldval);
+      oldval = loongarch_force_binary (SImode, ASHIFT, oldval, shiftsi);
+    }
+
+  /* Do the same for the new value.  */
+  if (newval && newval != const0_rtx)
+    {
+      newval = convert_modes (SImode, mode, newval, true);
+      newval = force_reg (SImode, newval);
+      newval = loongarch_force_binary (SImode, ASHIFT, newval, shiftsi);
+    }
+
+  /* Do the SImode atomic access.  */
+  if (result)
+    res = gen_reg_rtx (SImode);
+
+  if (newval)
+    si_op = generator.fn_7 (res, memsi, mask, inverted_mask, oldval, newval,
+			    model);
+  else if (result)
+    si_op = generator.fn_6 (res, memsi, mask, inverted_mask, oldval, model);
+  else
+    si_op = generator.fn_5 (memsi, mask, inverted_mask, oldval, model);
+
+  emit_insn (si_op);
+
+  if (result)
+    {
+      /* Shift and convert the result.  */
+      loongarch_emit_binary (AND, res, res, mask);
+      loongarch_emit_binary (LSHIFTRT, res, res, shiftsi);
+      loongarch_emit_move (result, gen_lowpart (GET_MODE (result), res));
+    }
+}
+
+/* Return true if (zero_extract OP WIDTH BITPOS) can be used as the
+   source of an "ext" instruction or the destination of an "ins"
+   instruction.  OP must be a register operand and the following
+   conditions must hold:
+
+   0 <= BITPOS < GET_MODE_BITSIZE (GET_MODE (op))
+   0 < WIDTH <= GET_MODE_BITSIZE (GET_MODE (op))
+   0 < BITPOS + WIDTH <= GET_MODE_BITSIZE (GET_MODE (op))
+
+   Also reject lengths equal to a word as they are better handled
+   by the move patterns.  */
+
+bool
+loongarch_use_ins_ext_p (rtx op, HOST_WIDE_INT width, HOST_WIDE_INT bitpos)
+{
+  if (!register_operand (op, VOIDmode)
+      || GET_MODE_BITSIZE (GET_MODE (op)) > BITS_PER_WORD)
+    return false;
+
+  if (!IN_RANGE (width, 1, GET_MODE_BITSIZE (GET_MODE (op)) - 1))
+    return false;
+
+  if (bitpos < 0 || bitpos + width > GET_MODE_BITSIZE (GET_MODE (op)))
+    return false;
+
+  return true;
+}
+
+/* Print the text for PRINT_OPERAND punctuation character CH to FILE.
+   The punctuation characters are:
+
+   '.'	Print the name of the register with a hard-wired zero (zero or $r0).
+   '$'	Print the name of the stack pointer register (sp or $r3).
+
+   See also loongarch_init_print_operand_punct.  */
+
+static void
+loongarch_print_operand_punctuation (FILE *file, int ch)
+{
+  switch (ch)
+    {
+    case '.':
+      fputs (reg_names[GP_REG_FIRST + 0], file);
+      break;
+
+    case '$':
+      fputs (reg_names[STACK_POINTER_REGNUM], file);
+      break;
+
+    default:
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Initialize loongarch_print_operand_punct.  */
+
+static void
+loongarch_init_print_operand_punct (void)
+{
+  const char *p;
+
+  for (p = ".$"; *p; p++)
+    loongarch_print_operand_punct[(unsigned char) *p] = true;
+}
+
+/* PRINT_OPERAND prefix LETTER refers to the integer branch instruction
+   associated with condition CODE.  Print the condition part of the
+   opcode to FILE.  */
+
+static void
+loongarch_print_int_branch_condition (FILE *file, enum rtx_code code,
+				      int letter)
+{
+  switch (code)
+    {
+    case EQ:
+    case NE:
+    case GT:
+    case GE:
+    case LT:
+    case LE:
+    case GTU:
+    case GEU:
+    case LTU:
+    case LEU:
+      /* Conveniently, the LoongArch names for these conditions are the same
+	 as their RTL equivalents.  */
+      fputs (GET_RTX_NAME (code), file);
+      break;
+
+    default:
+      output_operand_lossage ("'%%%c' is not a valid operand prefix", letter);
+      break;
+    }
+}
+
+/* Likewise floating-point branches.  */
+
+static void
+loongarch_print_float_branch_condition (FILE *file, enum rtx_code code,
+					int letter)
+{
+  switch (code)
+    {
+    case EQ:
+      fputs ("ceqz", file);
+      break;
+
+    case NE:
+      fputs ("cnez", file);
+      break;
+
+    default:
+      output_operand_lossage ("'%%%c' is not a valid operand prefix", letter);
+      break;
+    }
+}
+
+/* Implement TARGET_PRINT_OPERAND_PUNCT_VALID_P.  */
+
+static bool
+loongarch_print_operand_punct_valid_p (unsigned char code)
+{
+  return loongarch_print_operand_punct[code];
+}
+
+/* Return true if a FENCE should be emitted before a memory access to
+   implement the acquire or release portion of memory model MODEL.  */
+
+static bool
+loongarch_memmodel_needs_rel_acq_fence (enum memmodel model)
+{
+  switch (model)
+    {
+      case MEMMODEL_ACQ_REL:
+      case MEMMODEL_SEQ_CST:
+      case MEMMODEL_SYNC_SEQ_CST:
+      case MEMMODEL_RELEASE:
+      case MEMMODEL_SYNC_RELEASE:
+      case MEMMODEL_ACQUIRE:
+      case MEMMODEL_CONSUME:
+      case MEMMODEL_SYNC_ACQUIRE:
+	return true;
+
+      case MEMMODEL_RELAXED:
+	return false;
+
+      default:
+	gcc_unreachable ();
+    }
+}
+
+/* Return true if a FENCE should be emitted before a memory access to
+   implement the release portion of memory model MODEL.  */
+
+static bool
+loongarch_memmodel_needs_release_fence (enum memmodel model)
+{
+  switch (model)
+    {
+    case MEMMODEL_ACQ_REL:
+    case MEMMODEL_SEQ_CST:
+    case MEMMODEL_SYNC_SEQ_CST:
+    case MEMMODEL_RELEASE:
+    case MEMMODEL_SYNC_RELEASE:
+      return true;
+
+    case MEMMODEL_ACQUIRE:
+    case MEMMODEL_CONSUME:
+    case MEMMODEL_SYNC_ACQUIRE:
+    case MEMMODEL_RELAXED:
+      return false;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Implement TARGET_PRINT_OPERAND.  The LoongArch-specific operand codes are:
+
+   'X'	Print CONST_INT OP in hexadecimal format.
+   'x'	Print the low 16 bits of CONST_INT OP in hexadecimal format.
+   'd'	Print CONST_INT OP in decimal.
+   'm'	Print one less than CONST_INT OP in decimal.
+   'y'	Print exact log2 of CONST_INT OP in decimal.
+   'C'	Print the integer branch condition for comparison OP.
+   'N'	Print the inverse of the integer branch condition for comparison OP.
+   'F'	Print the FPU branch condition for comparison OP.
+   'W'	Print the inverse of the FPU branch condition for comparison OP.
+   'T'	Print 'f' for (eq:CC ...), 't' for (ne:CC ...),
+	      'z' for (eq:?I ...), 'n' for (ne:?I ...).
+   't'	Like 'T', but with the EQ/NE cases reversed.
+   'Y'	Print loongarch_fp_conditions[INTVAL (OP)].
+   'Z'	Print OP followed by a comma.
+   'z'	Print $0 if OP is zero, otherwise print OP normally.
+   'b'	Print the address of a memory operand, without offset.
+   'V'	Print exact log2 of CONST_INT OP element 0 of a replicated
+	  CONST_VECTOR in decimal.
+   'A'	Print a _DB suffix if the memory model requires acquire or release.
+   'G'	Print a DBAR insn if the memory model requires a release.
+   'i'	Print i if the operand is not a register.  */
+
+static void
+loongarch_print_operand (FILE *file, rtx op, int letter)
+{
+  enum rtx_code code;
+
+  if (loongarch_print_operand_punct_valid_p (letter))
+    {
+      loongarch_print_operand_punctuation (file, letter);
+      return;
+    }
+
+  gcc_assert (op);
+  code = GET_CODE (op);
+
+  switch (letter)
+    {
+    case 'X':
+      if (CONST_INT_P (op))
+	fprintf (file, HOST_WIDE_INT_PRINT_HEX, INTVAL (op));
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'x':
+      if (CONST_INT_P (op))
+	fprintf (file, HOST_WIDE_INT_PRINT_HEX, INTVAL (op) & 0xffff);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'd':
+      if (CONST_INT_P (op))
+	fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'm':
+      if (CONST_INT_P (op))
+	fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op) - 1);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'y':
+      if (CONST_INT_P (op))
+	{
+	  int val = exact_log2 (INTVAL (op));
+	  if (val != -1)
+	    fprintf (file, "%d", val);
+	  else
+	    output_operand_lossage ("invalid use of '%%%c'", letter);
+	}
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'V':
+      if (GET_CODE (op) == CONST_VECTOR)
+	{
+	  machine_mode mode = GET_MODE_INNER (GET_MODE (op));
+	  unsigned HOST_WIDE_INT val = UINTVAL (CONST_VECTOR_ELT (op, 0));
+	  int vlog2 = exact_log2 (val & GET_MODE_MASK (mode));
+	  if (vlog2 != -1)
+	    fprintf (file, "%d", vlog2);
+	  else
+	    output_operand_lossage ("invalid use of '%%%c'", letter);
+	}
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'C':
+      loongarch_print_int_branch_condition (file, code, letter);
+      break;
+
+    case 'N':
+      loongarch_print_int_branch_condition (file, reverse_condition (code),
+					    letter);
+      break;
+
+    case 'F':
+      loongarch_print_float_branch_condition (file, code, letter);
+      break;
+
+    case 'W':
+      loongarch_print_float_branch_condition (file, reverse_condition (code),
+					      letter);
+      break;
+
+    case 'T':
+    case 't':
+      {
+	int truth = (code == NE) == (letter == 'T');
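+	/* Index into "zfnt": TRUTH picks the false ("zf") or true ("nt")
+	   half, and FCC operands select 'f'/'t' while GPRs select
+	   'z'/'n'.  */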
+	fputc ("zfnt"[truth * 2 + FCC_REG_P (REGNO (XEXP (op, 0)))], file);
+      }
+      break;
+
+    case 'Y':
+      if (code == CONST_INT
+	  && UINTVAL (op) < ARRAY_SIZE (loongarch_fp_conditions))
+	fputs (loongarch_fp_conditions[UINTVAL (op)], file);
+      else
+	output_operand_lossage ("'%%%c' is not a valid operand prefix",
+				letter);
+      break;
+
+    case 'Z':
+      loongarch_print_operand (file, op, 0);
+      fputc (',', file);
+      break;
+
+    case 'A':
+      if (loongarch_memmodel_needs_rel_acq_fence ((enum memmodel) INTVAL (op)))
+	fputs ("_db", file);
+      break;
+
+    case 'G':
+      if (loongarch_memmodel_needs_release_fence ((enum memmodel) INTVAL (op)))
+	fputs ("dbar\t0", file);
+      break;
+
+    case 'i':
+      if (code != REG)
+	fputs ("i", file);
+      break;
+
+    default:
+      switch (code)
+	{
+	case REG:
+	  {
+	    unsigned int regno = REGNO (op);
+	    if (letter && letter != 'z')
+	      output_operand_lossage ("invalid use of '%%%c'", letter);
+	    fprintf (file, "%s", reg_names[regno]);
+	  }
+	  break;
+
+	case MEM:
+	  if (letter == 'D')
+	    output_address (GET_MODE (op),
+			    plus_constant (Pmode, XEXP (op, 0), 4));
+	  else if (letter == 'b')
+	    {
+	      gcc_assert (REG_P (XEXP (op, 0)));
+	      loongarch_print_operand (file, XEXP (op, 0), 0);
+	    }
+	  else if (letter && letter != 'z')
+	    output_operand_lossage ("invalid use of '%%%c'", letter);
+	  else
+	    output_address (GET_MODE (op), XEXP (op, 0));
+	  break;
+
+	default:
+	  if (letter == 'z' && op == CONST0_RTX (GET_MODE (op)))
+	    fputs (reg_names[GP_REG_FIRST], file);
+	  else if (letter && letter != 'z')
+	    output_operand_lossage ("invalid use of '%%%c'", letter);
+	  else
+	    output_addr_const (file, loongarch_strip_unspec_address (op));
+	  break;
+	}
+    }
+}
+
+/* Implement TARGET_PRINT_OPERAND_ADDRESS.  */
+
+static void
+loongarch_print_operand_address (FILE *file, machine_mode /* mode  */, rtx x)
+{
+  struct loongarch_address_info addr;
+
+  if (loongarch_classify_address (&addr, x, word_mode, true))
+    switch (addr.type)
+      {
+      case ADDRESS_REG:
+	fprintf (file, "%s,", reg_names[REGNO (addr.reg)]);
+	loongarch_print_operand (file, addr.offset, 0);
+	return;
+
+      case ADDRESS_REG_REG:
+	fprintf (file, "%s,%s", reg_names[REGNO (addr.reg)],
+				reg_names[REGNO (addr.offset)]);
+	return;
+
+      case ADDRESS_CONST_INT:
+	fprintf (file, "%s,", reg_names[GP_REG_FIRST]);
+	output_addr_const (file, x);
+	return;
+
+      case ADDRESS_SYMBOLIC:
+	output_addr_const (file, loongarch_strip_unspec_address (x));
+	return;
+      }
+  if (GET_CODE (x) == CONST_INT)
+    output_addr_const (file, x);
+  else
+    gcc_unreachable ();
+}
+
+/* Implement TARGET_ASM_SELECT_RTX_SECTION.  */
+
+static section *
+loongarch_select_rtx_section (machine_mode mode, rtx x,
+			      unsigned HOST_WIDE_INT align)
+{
+  /* ??? Consider using mergeable small data sections.  */
+  if (loongarch_rtx_constant_in_small_data_p (mode))
+    return get_named_section (NULL, ".sdata", 0);
+
+  return default_elf_select_rtx_section (mode, x, align);
+}
+
+/* Implement TARGET_ASM_FUNCTION_RODATA_SECTION.
+
+   The complication here is that jump tables will use absolute addresses
+   when !TARGET_ABSOLUTE_ABICALLS, and should therefore not be included
+   in the read-only part of a DSO.  Handle such cases by selecting a
+   normal data section instead of a read-only one.  The logic apes that
+   in default_function_rodata_section.  */
+
+static section *
+loongarch_function_rodata_section (tree decl, bool)
+{
+  return default_function_rodata_section (decl, false);
+}
+
+/* Implement TARGET_IN_SMALL_DATA_P.  */
+
+static bool
+loongarch_in_small_data_p (const_tree decl)
+{
+  int size;
+
+  if (TREE_CODE (decl) == STRING_CST || TREE_CODE (decl) == FUNCTION_DECL)
+    return false;
+
+  if (TREE_CODE (decl) == VAR_DECL && DECL_SECTION_NAME (decl) != 0)
+    {
+      const char *name;
+
+      /* Reject anything that isn't in a known small-data section.  */
+      name = DECL_SECTION_NAME (decl);
+      if (strcmp (name, ".sdata") != 0 && strcmp (name, ".sbss") != 0)
+	return false;
+
+      /* If a symbol is defined externally, the assembler will use the
+	 usual -G rules when deciding how to implement macros.  */
+      if (!DECL_EXTERNAL (decl))
+	return true;
+    }
+
+  /* We have traditionally not treated zero-sized objects as small data,
+     so this is now effectively part of the ABI.  */
+  size = int_size_in_bytes (TREE_TYPE (decl));
+  return size > 0 && size <= g_switch_value;
+}
+
+/* Implement TARGET_USE_ANCHORS_FOR_SYMBOL_P.  We don't want to use
+   anchors for small data: the GP register acts as an anchor in that
+   case.  We also don't want to use them for PC-relative accesses,
+   where the PC acts as an anchor.  */
+
+static bool
+loongarch_use_anchors_for_symbol_p (const_rtx symbol)
+{
+  return default_use_anchors_for_symbol_p (symbol);
+}
+
+/* The LoongArch debug format wants all automatic variables and arguments
+   to be in terms of the virtual frame pointer (stack pointer before
+   any adjustment in the function), while the LoongArch linker wants
+   the frame pointer to be the stack pointer after the initial
+   adjustment.  So, we do the adjustment here.  The arg pointer (which
+   is eliminated) points to the virtual frame pointer, while the frame
+   pointer (which may be eliminated) points to the stack pointer after
+   the initial adjustments.  */
+
+HOST_WIDE_INT
+loongarch_debugger_offset (rtx addr, HOST_WIDE_INT offset)
+{
+  rtx offset2 = const0_rtx;
+  rtx reg = eliminate_constant_term (addr, &offset2);
+
+  if (offset == 0)
+    offset = INTVAL (offset2);
+
+  if (reg == stack_pointer_rtx
+      || reg == frame_pointer_rtx
+      || reg == hard_frame_pointer_rtx)
+    {
+      offset -= cfun->machine->frame.total_size;
+      if (reg == hard_frame_pointer_rtx)
+	offset += cfun->machine->frame.hard_frame_pointer_offset;
+    }
+
+  return offset;
+}
+
+/* Implement ASM_OUTPUT_EXTERNAL.  */
+
+void
+loongarch_output_external (FILE *file, tree decl, const char *name)
+{
+  default_elf_asm_output_external (file, decl, name);
+
+  /* We output the name if and only if TREE_SYMBOL_REFERENCED is
+     set in order to avoid putting out names that are never really
+     used.  */
+  if (TREE_SYMBOL_REFERENCED (DECL_ASSEMBLER_NAME (decl)))
+    {
+      if (loongarch_in_small_data_p (decl))
+	{
+	  /* When using assembler macros, emit .extern directives for
+	     all small-data externs so that the assembler knows how
+	     big they are.
+
+	     In most cases it would be safe (though pointless) to emit
+	     .externs for other symbols too.  One exception is when an
+	     object is within the -G limit but declared by the user to
+	     be in a section other than .sbss or .sdata.  */
+	  fputs ("\t.extern\t", file);
+	  assemble_name (file, name);
+	  fprintf (file, ", " HOST_WIDE_INT_PRINT_DEC "\n",
+		   int_size_in_bytes (TREE_TYPE (decl)));
+	}
+    }
+}
+
+/* Implement TARGET_ASM_OUTPUT_DWARF_DTPREL.  */
+
+static void ATTRIBUTE_UNUSED
+loongarch_output_dwarf_dtprel (FILE *file, int size, rtx x)
+{
+  switch (size)
+    {
+    case 4:
+      fputs ("\t.dtprelword\t", file);
+      break;
+
+    case 8:
+      fputs ("\t.dtpreldword\t", file);
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+  output_addr_const (file, x);
+  fputs ("+0x8000", file);
+}
+
+/* Implement TARGET_DWARF_FRAME_REG_MODE.  */
+
+static machine_mode
+loongarch_dwarf_frame_reg_mode (int regno)
+{
+  machine_mode mode = default_dwarf_frame_reg_mode (regno);
+
+  return mode;
+}
+
+/* Implement ASM_OUTPUT_ASCII.  */
+
+void
+loongarch_output_ascii (FILE *stream, const char *string, size_t len)
+{
+  size_t i;
+  int cur_pos;
+
+  cur_pos = 17;
+  fprintf (stream, "\t.ascii\t\"");
+  for (i = 0; i < len; i++)
+    {
+      int c;
+
+      c = (unsigned char) string[i];
+      if (ISPRINT (c))
+	{
+	  if (c == '\\' || c == '\"')
+	    {
+	      putc ('\\', stream);
+	      cur_pos++;
+	    }
+	  putc (c, stream);
+	  cur_pos++;
+	}
+      else
+	{
+	  fprintf (stream, "\\%03o", c);
+	  cur_pos += 4;
+	}
+
+      if (cur_pos > 72 && i + 1 < len)
+	{
+	  cur_pos = 17;
+	  fprintf (stream, "\"\n\t.ascii\t\"");
+	}
+    }
+  fprintf (stream, "\"\n");
+}
+
+/* Emit either a label, .comm, or .lcomm directive.  When using assembler
+   macros, mark the symbol as written so that loongarch_output_external
+   won't emit an .extern for it.  STREAM is the output file, NAME is the
+   name of the symbol, INIT_STRING is the string that should be written
+   before the symbol and FINAL_STRING is the string that should be
+   written after it.  FINAL_STRING is a printf format that consumes the
+   remaining arguments.  */
+
+void
+loongarch_declare_object (FILE *stream, const char *name,
+			  const char *init_string, const char *final_string,
+			  ...)
+{
+  va_list ap;
+
+  fputs (init_string, stream);
+  assemble_name (stream, name);
+  va_start (ap, final_string);
+  vfprintf (stream, final_string, ap);
+  va_end (ap);
+
+  tree name_tree = get_identifier (name);
+  TREE_ASM_WRITTEN (name_tree) = 1;
+}
+
+/* Declare a common object of SIZE bytes using asm directive INIT_STRING.
+   NAME is the name of the object and ALIGN is the required alignment
+   in bits.  TAKES_ALIGNMENT_P is true if the directive takes a third
+   alignment argument.  */
+
+void
+loongarch_declare_common_object (FILE *stream, const char *name,
+				 const char *init_string,
+				 unsigned HOST_WIDE_INT size,
+				 unsigned int align, bool takes_alignment_p)
+{
+  if (!takes_alignment_p)
+    {
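+      /* Round SIZE up to a multiple of the alignment; e.g. (illustrative)
+	 SIZE 10 with ALIGN 64 bits becomes 16 bytes.  */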
+      size += (align / BITS_PER_UNIT) - 1;
+      size -= size % (align / BITS_PER_UNIT);
+      loongarch_declare_object (stream, name, init_string,
+				"," HOST_WIDE_INT_PRINT_UNSIGNED "\n", size);
+    }
+  else
+    loongarch_declare_object (stream, name, init_string,
+			      "," HOST_WIDE_INT_PRINT_UNSIGNED ",%u\n", size,
+			      align / BITS_PER_UNIT);
+}
+
+/* Implement ASM_OUTPUT_ALIGNED_DECL_COMMON.  This is usually the same as the
+   elfos.h version.  */
+
+void
+loongarch_output_aligned_decl_common (FILE *stream,
+				      tree decl ATTRIBUTE_UNUSED,
+				      const char *name,
+				      unsigned HOST_WIDE_INT size,
+				      unsigned int align)
+{
+  loongarch_declare_common_object (stream, name, "\n\t.comm\t", size, align,
+				   true);
+}
+
+#ifdef ASM_OUTPUT_SIZE_DIRECTIVE
+extern int size_directive_output;
+
+/* Implement ASM_DECLARE_OBJECT_NAME.  This is like most of the standard
+   ELF definitions except that it uses loongarch_declare_object to emit
+   the label.  */
+
+void
+loongarch_declare_object_name (FILE *stream, const char *name,
+			       tree decl ATTRIBUTE_UNUSED)
+{
+#ifdef ASM_OUTPUT_TYPE_DIRECTIVE
+#ifdef USE_GNU_UNIQUE_OBJECT
+  /* As in elfos.h.  */
+  if (USE_GNU_UNIQUE_OBJECT && DECL_ONE_ONLY (decl)
+      && (!DECL_ARTIFICIAL (decl) || !TREE_READONLY (decl)))
+    ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "gnu_unique_object");
+  else
+#endif
+    ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "object");
+#endif
+
+  size_directive_output = 0;
+  if (!flag_inhibit_size_directive && DECL_SIZE (decl))
+    {
+      HOST_WIDE_INT size;
+
+      size_directive_output = 1;
+      size = int_size_in_bytes (TREE_TYPE (decl));
+      ASM_OUTPUT_SIZE_DIRECTIVE (stream, name, size);
+    }
+
+  loongarch_declare_object (stream, name, "", ":\n");
+}
+
+/* Implement ASM_FINISH_DECLARE_OBJECT.  This is generic ELF stuff.  */
+
+void
+loongarch_finish_declare_object (FILE *stream, tree decl, int top_level,
+				 int at_end)
+{
+  const char *name;
+
+  name = XSTR (XEXP (DECL_RTL (decl), 0), 0);
+  if (!flag_inhibit_size_directive
+      && DECL_SIZE (decl) != 0
+      && !at_end
+      && top_level
+      && DECL_INITIAL (decl) == error_mark_node
+      && !size_directive_output)
+    {
+      HOST_WIDE_INT size;
+
+      size_directive_output = 1;
+      size = int_size_in_bytes (TREE_TYPE (decl));
+      ASM_OUTPUT_SIZE_DIRECTIVE (stream, name, size);
+    }
+}
+#endif
+
+/* Mark text contents as code or data, mainly for the purpose of correct
+   disassembly.  Emit a local symbol and set its type appropriately for
+   that purpose.  */
+
+void
+loongarch_set_text_contents_type (FILE *file ATTRIBUTE_UNUSED,
+				  const char *prefix ATTRIBUTE_UNUSED,
+				  unsigned long num ATTRIBUTE_UNUSED,
+				  bool function_p ATTRIBUTE_UNUSED)
+{
+#ifdef ASM_OUTPUT_TYPE_DIRECTIVE
+  char buf[(sizeof (num) * 10) / 4 + 2];
+  const char *fnname;
+  char *sname;
+  rtx symbol;
+
+  sprintf (buf, "%lu", num);
+  symbol = XEXP (DECL_RTL (current_function_decl), 0);
+  fnname = targetm.strip_name_encoding (XSTR (symbol, 0));
+  sname = ACONCAT ((prefix, fnname, "_", buf, NULL));
+
+  ASM_OUTPUT_TYPE_DIRECTIVE (file, sname, function_p ? "function" : "object");
+  assemble_name (file, sname);
+  fputs (":\n", file);
+#endif
+}
+
+/* Implement TARGET_ASM_FILE_START.  */
+
+static void
+loongarch_file_start (void)
+{
+  default_file_start ();
+}
+
+/* Implement TARGET_FRAME_POINTER_REQUIRED.  */
+
+static bool
+loongarch_frame_pointer_required (void)
+{
+  /* If the function contains dynamic stack allocations, we need to
+     use the frame pointer to access the static parts of the frame.  */
+  if (cfun->calls_alloca)
+    return true;
+
+  return false;
+}
+
+/* Implement TARGET_CAN_ELIMINATE.  Make sure that we're not trying
+   to eliminate to the wrong hard frame pointer.  */
+
+static bool
+loongarch_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to)
+{
+  return (to == HARD_FRAME_POINTER_REGNUM || to == STACK_POINTER_REGNUM);
+}
+
+/* Implement RETURN_ADDR_RTX.  We do not support moving back to a
+   previous frame.  */
+
+rtx
+loongarch_return_addr (int count, rtx frame ATTRIBUTE_UNUSED)
+{
+  if (count != 0)
+    return const0_rtx;
+
+  return get_hard_reg_initial_val (Pmode, RETURN_ADDR_REGNUM);
+}
+
+/* Emit code to change the current function's return address to
+   ADDRESS.  SCRATCH is available as a scratch register, if needed.
+   ADDRESS and SCRATCH are both word-mode GPRs.  */
+
+void
+loongarch_set_return_address (rtx address, rtx scratch)
+{
+  rtx slot_address;
+
+  gcc_assert (BITSET_P (cfun->machine->frame.mask, RETURN_ADDR_REGNUM));
+
+  if (frame_pointer_needed)
+    slot_address = loongarch_add_offset (scratch, hard_frame_pointer_rtx,
+					 -UNITS_PER_WORD);
+  else
+    slot_address = loongarch_add_offset (scratch, stack_pointer_rtx,
+					 cfun->machine->frame.gp_sp_offset);
+
+  loongarch_emit_move (gen_frame_mem (GET_MODE (address), slot_address),
+		       address);
+}
+
+/* Implement ASM_DECLARE_FUNCTION_NAME.  */
+
+void
+loongarch_declare_function_name (FILE *stream ATTRIBUTE_UNUSED,
+				 const char *name,
+				 tree fndecl ATTRIBUTE_UNUSED)
+{
+  loongarch_start_function_definition (name);
+}
+
+/* Return true if register REGNO can store a value of mode MODE.
+   The result of this function is cached in loongarch_hard_regno_mode_ok.  */
+
+static bool
+loongarch_hard_regno_mode_ok_uncached (unsigned int regno, machine_mode mode)
+{
+  unsigned int size;
+  enum mode_class mclass;
+
+  if (mode == FCCmode)
+    return FCC_REG_P (regno);
+
+  size = GET_MODE_SIZE (mode);
+  mclass = GET_MODE_CLASS (mode);
+
+  if (GP_REG_P (regno))
+    return ((regno - GP_REG_FIRST) & 1) == 0 || size <= UNITS_PER_WORD;
+
+  if (FP_REG_P (regno)
+      && (((regno - FP_REG_FIRST) % MAX_FPRS_PER_FMT) == 0
+	  || (MIN_FPRS_PER_FMT == 1 && size <= UNITS_PER_FPREG)))
+    {
+      if (mclass == MODE_FLOAT
+	  || mclass == MODE_COMPLEX_FLOAT
+	  || mclass == MODE_VECTOR_FLOAT)
+	return size <= UNITS_PER_FPVALUE;
+
+      /* Allow integer modes that fit into a single register.  We need
+	 to put integers into FPRs when using conversion instructions
+	 such as FFINT and FTINT.  There's no point allowing sizes
+	 smaller than a word, because the FPU has no appropriate
+	 load/store instructions.  */
+      if (mclass == MODE_INT)
+	return size >= MIN_UNITS_PER_WORD && size <= UNITS_PER_FPREG;
+    }
+
+  return false;
+}
+
+/* Implement TARGET_HARD_REGNO_MODE_OK.  */
+
+static bool
+loongarch_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
+{
+  return loongarch_hard_regno_mode_ok_p[mode][regno];
+}
+
+/* Return nonzero if register OLD_REG can be renamed to register NEW_REG.  */
+
+bool
+loongarch_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
+				unsigned int new_reg ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
+/* Implement TARGET_HARD_REGNO_NREGS.  */
+
+static unsigned int
+loongarch_hard_regno_nregs (unsigned int regno, machine_mode mode)
+{
+  if (FCC_REG_P (regno))
+    /* The size of FP status registers is always 4, because they only hold
+       FCCmode values, and FCCmode is always considered to be 4 bytes wide.  */
+    return (GET_MODE_SIZE (mode) + 3) / 4;
+
+  if (FP_REG_P (regno))
+    return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+
+  /* All other registers are word-sized.  */
+  return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+}
+
+/* Implement CLASS_MAX_NREGS, taking the maximum of the cases
+   in loongarch_hard_regno_nregs.  */
+
+int
+loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode)
+{
+  int size;
+  HARD_REG_SET left;
+
+  size = 0x8000;
+  left = reg_class_contents[rclass];
+  if (hard_reg_set_intersect_p (left, reg_class_contents[(int) FCC_REGS]))
+    {
+      if (loongarch_hard_regno_mode_ok (FCC_REG_FIRST, mode))
+	size = MIN (size, 4);
+
+      left &= ~reg_class_contents[FCC_REGS];
+    }
+  if (hard_reg_set_intersect_p (left, reg_class_contents[(int) FP_REGS]))
+    {
+      if (loongarch_hard_regno_mode_ok (FP_REG_FIRST, mode))
+	size = MIN (size, UNITS_PER_FPREG);
+
+      left &= ~reg_class_contents[FP_REGS];
+    }
+  if (!hard_reg_set_empty_p (left))
+    size = MIN (size, UNITS_PER_WORD);
+  return (GET_MODE_SIZE (mode) + size - 1) / size;
+}
+
+/* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
+
+static bool
+loongarch_can_change_mode_class (machine_mode from, machine_mode to,
+				 reg_class_t rclass)
+{
+  /* Allow conversions between different 64-bit integer vector modes,
+     and between those vectors and DImode.  */
+  if (GET_MODE_SIZE (from) == 8 && GET_MODE_SIZE (to) == 8
+      && INTEGRAL_MODE_P (from) && INTEGRAL_MODE_P (to))
+    return true;
+
+  return !reg_classes_intersect_p (FP_REGS, rclass);
+}
+
+/* Return true if moves in mode MODE can use the FPU's fmov.fmt
+   instruction.  */
+
+static bool
+loongarch_mode_ok_for_mov_fmt_p (machine_mode mode)
+{
+  switch (mode)
+    {
+    case E_FCCmode:
+    case E_SFmode:
+      return TARGET_HARD_FLOAT;
+
+    case E_DFmode:
+      return TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT;
+
+    case E_V2SFmode:
+      return 0;
+
+    default:
+      return 0;
+    }
+}
+
+/* Implement TARGET_MODES_TIEABLE_P.  */
+
+static bool
+loongarch_modes_tieable_p (machine_mode mode1, machine_mode mode2)
+{
+  /* FPRs allow no mode punning, so it's not worth tying modes if we'd
+     prefer to put one of them in FPRs.  */
+  return (mode1 == mode2
+	  || (!loongarch_mode_ok_for_mov_fmt_p (mode1)
+	      && !loongarch_mode_ok_for_mov_fmt_p (mode2)));
+}
+
+/* Implement TARGET_PREFERRED_RELOAD_CLASS.  */
+
+static reg_class_t
+loongarch_preferred_reload_class (rtx x, reg_class_t rclass)
+{
+  if (reg_class_subset_p (FP_REGS, rclass)
+      && loongarch_mode_ok_for_mov_fmt_p (GET_MODE (x)))
+    return FP_REGS;
+
+  if (reg_class_subset_p (GR_REGS, rclass))
+    rclass = GR_REGS;
+
+  return rclass;
+}
+
+/* RCLASS is a class involved in a REGISTER_MOVE_COST calculation.
+   Return a "canonical" class to represent it in later calculations.  */
+
+static reg_class_t
+loongarch_canonicalize_move_class (reg_class_t rclass)
+{
+  if (reg_class_subset_p (rclass, GENERAL_REGS))
+    rclass = GENERAL_REGS;
+
+  return rclass;
+}
+
+/* Return the cost of moving a value from a register of class FROM to a GPR.
+   Return 0 for classes that are unions of other classes handled by this
+   function.  */
+
+static int
+loongarch_move_to_gpr_cost (reg_class_t from)
+{
+  switch (from)
+    {
+    case GENERAL_REGS:
+      /* MOVE macro.  */
+      return 2;
+
+    case FP_REGS:
+      /* MOVFR2GR, etc.  */
+      return 4;
+
+    default:
+      return 0;
+    }
+}
+
+/* Return the cost of moving a value from a GPR to a register of class TO.
+   Return 0 for classes that are unions of other classes handled by this
+   function.  */
+
+static int
+loongarch_move_from_gpr_cost (reg_class_t to)
+{
+  switch (to)
+    {
+    case GENERAL_REGS:
+      /* MOVE macro.  */
+      return 2;
+
+    case FP_REGS:
+      /* MOVGR2FR, etc.  */
+      return 4;
+
+    default:
+      return 0;
+    }
+}
+
+/* Implement TARGET_REGISTER_MOVE_COST.  Return 0 for classes that are the
+   maximum of the move costs for subclasses; regclass will work out
+   the maximum for us.  */
+
+static int
+loongarch_register_move_cost (machine_mode mode, reg_class_t from,
+			      reg_class_t to)
+{
+  reg_class_t dregs;
+  int cost1, cost2;
+
+  from = loongarch_canonicalize_move_class (from);
+  to = loongarch_canonicalize_move_class (to);
+
+  /* Handle moves that can be done without using general-purpose registers.  */
+  if (from == FP_REGS)
+    {
+      if (to == FP_REGS && loongarch_mode_ok_for_mov_fmt_p (mode))
+	/* FMOV.FMT.  */
+	return 4;
+    }
+
+  /* Handle cases in which only one class deviates from the ideal.  */
+  dregs = GENERAL_REGS;
+  if (from == dregs)
+    return loongarch_move_from_gpr_cost (to);
+  if (to == dregs)
+    return loongarch_move_to_gpr_cost (from);
+
+  /* Handle cases that require a GPR temporary.  */
+  cost1 = loongarch_move_to_gpr_cost (from);
+  if (cost1 != 0)
+    {
+      cost2 = loongarch_move_from_gpr_cost (to);
+      if (cost2 != 0)
+	return cost1 + cost2;
+    }
+
+  return 0;
+}
+
+/* Implement TARGET_MEMORY_MOVE_COST.  */
+
+static int
+loongarch_memory_move_cost (machine_mode mode, reg_class_t rclass, bool in)
+{
+  return (loongarch_cost->memory_latency
+	  + memory_move_secondary_cost (mode, rclass, in));
+}
+
+/* Implement TARGET_SECONDARY_MEMORY_NEEDED.
+
+   All moves between register classes can currently be done with direct
+   move instructions or through the secondary reloads provided by
+   loongarch_secondary_reload, so no class pair needs to go via memory.  */
+
+static bool
+loongarch_secondary_memory_needed (machine_mode mode ATTRIBUTE_UNUSED,
+				   reg_class_t class1, reg_class_t class2)
+{
+  /* Ignore spilled pseudos.  */
+  if (lra_in_progress && (class1 == NO_REGS || class2 == NO_REGS))
+    return false;
+
+  return false;
+}
+
+/* Return the register class required for a secondary register when
+   copying between one of the registers in RCLASS and value X, which
+   has mode MODE.  X is the source of the move if IN_P, otherwise it
+   is the destination.  Return NO_REGS if no secondary register is
+   needed.  */
+
+static reg_class_t
+loongarch_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
+			    reg_class_t rclass, machine_mode mode,
+			    secondary_reload_info *sri ATTRIBUTE_UNUSED)
+{
+  int regno;
+
+  regno = true_regnum (x);
+
+  if (reg_class_subset_p (rclass, FP_REGS))
+    {
+      if (regno < 0
+	  || (MEM_P (x)
+	      && (GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)))
+	/* In this case we can use fld.s, fst.s, fld.d or fst.d.  */
+	return NO_REGS;
+
+      if (GP_REG_P (regno) || x == CONST0_RTX (mode))
+	/* In this case we can use movgr2fr.s, movfr2gr.s, movgr2fr.d or
+	   movfr2gr.d.  */
+	return NO_REGS;
+
+      if (CONSTANT_P (x) && !targetm.cannot_force_const_mem (mode, x))
+	/* We can force the constant into memory and then load it with
+	   fld.s or fld.d.  */
+	return NO_REGS;
+
+      if (FP_REG_P (regno) && loongarch_mode_ok_for_mov_fmt_p (mode))
+	/* In this case we can use fmov.{s/d}.  */
+	return NO_REGS;
+
+      /* Otherwise, we need to reload through an integer register.  */
+      return GR_REGS;
+    }
+  if (FP_REG_P (regno))
+    return reg_class_subset_p (rclass, GR_REGS) ? NO_REGS : GR_REGS;
+
+  return NO_REGS;
+}
+
+/* Implement TARGET_VALID_POINTER_MODE.  */
+
+static bool
+loongarch_valid_pointer_mode (scalar_int_mode mode)
+{
+  return mode == SImode || (TARGET_64BIT && mode == DImode);
+}
+
+/* Implement TARGET_SCALAR_MODE_SUPPORTED_P.  */
+
+static bool
+loongarch_scalar_mode_supported_p (scalar_mode mode)
+{
+  if (ALL_FIXED_POINT_MODE_P (mode)
+      && GET_MODE_PRECISION (mode) <= 2 * BITS_PER_WORD)
+    return true;
+
+  return default_scalar_mode_supported_p (mode);
+}
+
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
+
+static machine_mode
+loongarch_preferred_simd_mode (scalar_mode mode ATTRIBUTE_UNUSED)
+{
+  return word_mode;
+}
+
+/* Return the assembly code for INSN, which has the operands given by
+   OPERANDS, and which branches to OPERANDS[0] if some condition is true.
+   BRANCH_IF_TRUE is the asm template that should be used if OPERANDS[0]
+   is in range of a direct branch.  BRANCH_IF_FALSE is an inverted
+   version of BRANCH_IF_TRUE.  */
+
+const char *
+loongarch_output_conditional_branch (rtx_insn *insn, rtx *operands,
+				     const char *branch_if_true,
+				     const char *branch_if_false)
+{
+  unsigned int length;
+  rtx taken;
+
+  gcc_assert (LABEL_P (operands[0]));
+
+  length = get_attr_length (insn);
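+  /* A length of at most 4 bytes means OPERANDS[0] is within range of a
+     single direct conditional branch.  */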
+  if (length <= 4)
+    return branch_if_true;
+
+  /* Generate a reversed branch around a direct jump.  */
+  rtx_code_label *not_taken = gen_label_rtx ();
+  taken = operands[0];
+
+  /* Generate the reversed branch to NOT_TAKEN.  */
+  operands[0] = not_taken;
+  output_asm_insn (branch_if_false, operands);
+
+  output_asm_insn ("b\t%0", &taken);
+
+  /* Output NOT_TAKEN.  */
+  targetm.asm_out.internal_label (asm_out_file, "L",
+				  CODE_LABEL_NUMBER (not_taken));
+  return "";
+}
+
+/* Return the assembly code for INSN, which branches to OPERANDS[0]
+   if some equality condition is true.  The condition is given by
+   OPERANDS[1] if !INVERTED_P, otherwise it is the inverse of
+   OPERANDS[1].  OPERANDS[2] is the comparison's first operand;
+   OPERANDS[3] is the second operand and may be zero or a register.  */
+
+const char *
+loongarch_output_equal_conditional_branch (rtx_insn *insn, rtx *operands,
+					   bool inverted_p)
+{
+  const char *branch[2];
+  if (operands[3] == const0_rtx)
+    {
+      branch[!inverted_p] = LARCH_BRANCH ("b%C1z", "%2,%0");
+      branch[inverted_p] = LARCH_BRANCH ("b%N1z", "%2,%0");
+    }
+  else
+    {
+      branch[!inverted_p] = LARCH_BRANCH ("b%C1", "%2,%z3,%0");
+      branch[inverted_p] = LARCH_BRANCH ("b%N1", "%2,%z3,%0");
+    }
+
+  return loongarch_output_conditional_branch (insn, operands, branch[1],
+					      branch[0]);
+}
+
+/* Return the assembly code for INSN, which branches to OPERANDS[0]
+   if some ordering condition is true.  The condition is given by
+   OPERANDS[1] if !INVERTED_P, otherwise it is the inverse of
+   OPERANDS[1].  OPERANDS[2] is the comparison's first operand;
+   OPERANDS[3] is the second operand and may be zero or a register.  */
+
+const char *
+loongarch_output_order_conditional_branch (rtx_insn *insn, rtx *operands,
+					   bool inverted_p)
+{
+  const char *branch[2];
+
+  /* Make BRANCH[1] branch to OPERANDS[0] when the condition is true.
+     Make BRANCH[0] branch on the inverse condition.  */
+  if (operands[3] != const0_rtx)
+    {
+      /* Handle degenerate cases that should not, but do, occur.  */
+      if (REGNO (operands[2]) == REGNO (operands[3]))
+	{
+	  switch (GET_CODE (operands[1]))
+	    {
+	    case LT:
+	    case LTU:
+	    case GT:
+	    case GTU:
+	      inverted_p = !inverted_p;
+	      /* Fall through.  */
+	    case LE:
+	    case LEU:
+	    case GE:
+	    case GEU:
+	      branch[!inverted_p] = LARCH_BRANCH ("b", "%0");
+	      branch[inverted_p] = "\t# branch never";
+	      break;
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
+	{
+	  switch (GET_CODE (operands[1]))
+	    {
+	    case LE:
+	    case LEU:
+	    case GT:
+	    case GTU:
+	    case LT:
+	    case LTU:
+	    case GE:
+	    case GEU:
+	      branch[!inverted_p] = LARCH_BRANCH ("b%C1", "%2,%3,%0");
+	      branch[inverted_p] = LARCH_BRANCH ("b%N1", "%2,%3,%0");
+	      break;
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+    }
+  else
+    {
+      switch (GET_CODE (operands[1]))
+	{
+	  /* These cases are equivalent to comparisons against zero.  */
+	case LEU:
+	case GTU:
+	case LTU:
+	case GEU:
+	case LE:
+	case GT:
+	case LT:
+	case GE:
+	  branch[!inverted_p] = LARCH_BRANCH ("b%C1", "%2,$r0,%0");
+	  branch[inverted_p] = LARCH_BRANCH ("b%N1", "%2,$r0,%0");
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+    }
+  return loongarch_output_conditional_branch (insn, operands, branch[1],
+					      branch[0]);
+}
+
+/* Return the assembly code for DIV.{W/D} instruction DIVISION, which has
+   the operands given by OPERANDS.  Add in a divide-by-zero check if
+   needed.  */
+
+const char *
+loongarch_output_division (const char *division, rtx *operands)
+{
+  const char *s;
+
+  s = division;
+  if (TARGET_CHECK_ZERO_DIV)
+    {
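+      /* Emit the division itself here; the template we return branches
+	 over a "break 7" trap when the divisor %2 is nonzero, so a zero
+	 divisor traps.  */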
+      output_asm_insn (s, operands);
+      s = "bne\t%2,%.,1f\n\tbreak\t7\n1:";
+    }
+  return s;
+}
+
+/* Implement TARGET_SCHED_ADJUST_COST.  We assume that anti dependencies
+   have no cost, while true and output dependencies keep their full
+   cost.  */
+
+static int
+loongarch_adjust_cost (rtx_insn *, int dep_type, rtx_insn *, int cost,
+		       unsigned int)
+{
+  if (dep_type != 0 && dep_type != REG_DEP_OUTPUT)
+    return 0;
+  return cost;
+}
+
+/* Return the number of instructions that can be issued per cycle.  */
+
+static int
+loongarch_issue_rate (void)
+{
+  if ((unsigned long) __ACTUAL_TUNE < N_TUNE_TYPES)
+    return loongarch_cpu_issue_rate[__ACTUAL_TUNE];
+  else
+    return 1;
+}
+
+/* Implement TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD.  This should
+   be as wide as the scheduling freedom in the DFA.  */
+
+static int
+loongarch_multipass_dfa_lookahead (void)
+{
+  if ((unsigned long) __ACTUAL_TUNE < N_TUNE_TYPES)
+    return loongarch_cpu_multipass_dfa_lookahead[__ACTUAL_TUNE];
+  else
+    return 0;
+}
+
+/* Implement TARGET_SCHED_REORDER.  */
+
+static int
+loongarch_sched_reorder (FILE *file ATTRIBUTE_UNUSED,
+			 int verbose ATTRIBUTE_UNUSED,
+			 rtx_insn **ready ATTRIBUTE_UNUSED,
+			 int *nreadyp ATTRIBUTE_UNUSED,
+			 int cycle ATTRIBUTE_UNUSED)
+{
+  return loongarch_issue_rate ();
+}
+
+/* Implement TARGET_SCHED_REORDER2.  */
+
+static int
+loongarch_sched_reorder2 (FILE *file ATTRIBUTE_UNUSED,
+			  int verbose ATTRIBUTE_UNUSED,
+			  rtx_insn **ready ATTRIBUTE_UNUSED,
+			  int *nreadyp ATTRIBUTE_UNUSED,
+			  int cycle ATTRIBUTE_UNUSED)
+{
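+  /* Return the issue budget that loongarch_variable_issue recorded for
+     the current cycle.  */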
+  return cached_can_issue_more;
+}
+
+/* Implement TARGET_SCHED_INIT.  */
+
+static void
+loongarch_sched_init (FILE *file ATTRIBUTE_UNUSED,
+		      int verbose ATTRIBUTE_UNUSED,
+		      int max_ready ATTRIBUTE_UNUSED)
+{}
+
+/* Implement TARGET_SCHED_VARIABLE_ISSUE.  */
+
+static int
+loongarch_variable_issue (FILE *file ATTRIBUTE_UNUSED,
+			  int verbose ATTRIBUTE_UNUSED, rtx_insn *insn,
+			  int more)
+{
+  /* Ignore USEs and CLOBBERs; don't count them against the issue rate.  */
+  if (USEFUL_INSN_P (insn))
+    {
+      if (get_attr_type (insn) != TYPE_GHOST)
+	more--;
+    }
+
+  /* Instructions of type 'multi' should all be split before
+     the second scheduling pass.  */
+  gcc_assert (!reload_completed
+	      || recog_memoized (insn) < 0
+	      || get_attr_type (insn) != TYPE_MULTI);
+
+  cached_can_issue_more = more;
+  return more;
+}
+
+/* A structure representing the state of the processor pipeline.
+   Used by the loongarch_sim_* family of functions.  */
+
+struct loongarch_sim
+{
+  /* The maximum number of instructions that can be issued in a cycle.
+     (Caches loongarch_issue_rate.)  */
+  unsigned int issue_rate;
+
+  /* The current simulation time.  */
+  unsigned int time;
+
+  /* How many more instructions can be issued in the current cycle.  */
+  unsigned int insns_left;
+
+  /* LAST_SET[X].INSN is the last instruction to set register X.
+     LAST_SET[X].TIME is the time at which that instruction was issued.
+     INSN is null if no instruction has yet set register X.  */
+  struct
+  {
+    rtx_insn *insn;
+    unsigned int time;
+  } last_set[FIRST_PSEUDO_REGISTER];
+
+  /* The pipeline's current DFA state.  */
+  state_t dfa_state;
+};
+
+/* Reset STATE to the initial simulation state.  */
+
+static void
+loongarch_sim_reset (struct loongarch_sim *state)
+{
+  curr_state = state->dfa_state;
+
+  state->time = 0;
+  state->insns_left = state->issue_rate;
+  memset (&state->last_set, 0, sizeof (state->last_set));
+  state_reset (curr_state);
+
+  targetm.sched.init (0, false, 0);
+  advance_state (curr_state);
+}
+
+/* Initialize STATE before its first use.  DFA_STATE points to an
+   allocated but uninitialized DFA state.  */
+
+static void
+loongarch_sim_init (struct loongarch_sim *state, state_t dfa_state)
+{
+  if (targetm.sched.init_dfa_pre_cycle_insn)
+    targetm.sched.init_dfa_pre_cycle_insn ();
+
+  if (targetm.sched.init_dfa_post_cycle_insn)
+    targetm.sched.init_dfa_post_cycle_insn ();
+
+  state->issue_rate = loongarch_issue_rate ();
+  state->dfa_state = dfa_state;
+  loongarch_sim_reset (state);
+}
+
+/* Set up costs based on the current architecture and tuning settings.  */
+
+static void
+loongarch_set_tuning_info (void)
+{
+  dfa_start ();
+
+  struct loongarch_sim state;
+  loongarch_sim_init (&state, alloca (state_size ()));
+
+  dfa_finish ();
+}
+
+/* Implement TARGET_EXPAND_TO_RTL_HOOK.  */
+
+static void
+loongarch_expand_to_rtl_hook (void)
+{
+  /* We need to call this at a point where we can safely create sequences
+     of instructions, so TARGET_OVERRIDE_OPTIONS is too early.  We also
+     need to call it at a point where the DFA infrastructure is not
+     already in use, so we can't just call it lazily on demand.
+
+     At present, loongarch_tuning_info is only needed during post-expand
+     RTL passes such as split_insns, so this hook should be early enough.
+     We may need to move the call elsewhere if loongarch_tuning_info starts
+     to be used for other things (such as rtx_costs, or expanders that
+     could be called during gimple optimization).  */
+
+  loongarch_set_tuning_info ();
+}
+
+
+/* Given that we have an rtx of the form (prefetch ... WRITE LOCALITY),
+   return the first operand for the corresponding prefetch instruction.  */
+
+rtx
+loongarch_prefetch_cookie (rtx write, rtx locality)
+{
+  /* store_streamed / load_streamed.  */
+  if (INTVAL (locality) <= 0)
+    return GEN_INT (INTVAL (write) + 4);
+
+  /* store / load.  */
+  if (INTVAL (locality) <= 2)
+    return write;
+
+  /* store_retained / load_retained.  */
+  return GEN_INT (INTVAL (write) + 6);
+}
+
+/* Implement TARGET_ASM_OUTPUT_MI_THUNK.  Generate rtl rather than asm text
+   in order to avoid duplicating too much logic from elsewhere.  */
+
+static void
+loongarch_output_mi_thunk (FILE *file, tree thunk_fndecl ATTRIBUTE_UNUSED,
+			   HOST_WIDE_INT delta, HOST_WIDE_INT vcall_offset,
+			   tree function)
+{
+  const char *fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk_fndecl));
+  rtx this_rtx, temp1, temp2, fnaddr;
+  rtx_insn *insn;
+  bool use_sibcall_p;
+
+  /* Pretend to be a post-reload pass while generating rtl.  */
+  reload_completed = 1;
+
+  /* Mark the end of the (empty) prologue.  */
+  emit_note (NOTE_INSN_PROLOGUE_END);
+
+  /* Determine if we can use a sibcall to call FUNCTION directly.  */
+  fnaddr = XEXP (DECL_RTL (function), 0);
+  use_sibcall_p = const_call_insn_operand (fnaddr, Pmode);
+
+  /* We need two temporary registers in some cases.  */
+  temp1 = gen_rtx_REG (Pmode, 12);
+  temp2 = gen_rtx_REG (Pmode, 13);
+
+  /* Find out which register contains the "this" pointer.  */
+  if (aggregate_value_p (TREE_TYPE (TREE_TYPE (function)), function))
+    this_rtx = gen_rtx_REG (Pmode, GP_ARG_FIRST + 1);
+  else
+    this_rtx = gen_rtx_REG (Pmode, GP_ARG_FIRST);
+
+  /* Add DELTA to THIS_RTX.  */
+  if (delta != 0)
+    {
+      rtx offset = GEN_INT (delta);
+      if (!IMM12_OPERAND (delta))
+	{
+	  loongarch_emit_move (temp1, offset);
+	  offset = temp1;
+	}
+      emit_insn (gen_add3_insn (this_rtx, this_rtx, offset));
+    }
+
+  /* If needed, add *(*THIS_RTX + VCALL_OFFSET) to THIS_RTX.  */
+  if (vcall_offset != 0)
+    {
+      rtx addr;
+
+      /* Set TEMP1 to *THIS_RTX.  */
+      loongarch_emit_move (temp1, gen_rtx_MEM (Pmode, this_rtx));
+
+      /* Set ADDR to a legitimate address for *THIS_RTX + VCALL_OFFSET.  */
+      addr = loongarch_add_offset (temp2, temp1, vcall_offset);
+
+      /* Load the offset and add it to THIS_RTX.  */
+      loongarch_emit_move (temp1, gen_rtx_MEM (Pmode, addr));
+      emit_insn (gen_add3_insn (this_rtx, this_rtx, temp1));
+    }
+
+  /* Jump to the target function.  Use a sibcall if direct jumps are
+     allowed, otherwise load the address into a register first.  */
+  if (use_sibcall_p)
+    {
+      insn = emit_call_insn (gen_sibcall_internal (fnaddr, const0_rtx));
+      SIBLING_CALL_P (insn) = 1;
+    }
+  else
+    {
+      loongarch_emit_move (temp1, fnaddr);
+      emit_jump_insn (gen_indirect_jump (temp1));
+    }
+
+  /* Run just enough of rest_of_compilation.  This sequence was
+     "borrowed" from alpha.c.  */
+  insn = get_insns ();
+  split_all_insns_noflow ();
+  shorten_branches (insn);
+  assemble_start_function (thunk_fndecl, fnname);
+  final_start_function (insn, file, 1);
+  final (insn, file, 1);
+  final_end_function ();
+  assemble_end_function (thunk_fndecl, fnname);
+
+  /* Stop pretending to be a post-reload pass.  */
+  reload_completed = 0;
+}
+
+/* Allocate a chunk of memory for per-function machine-dependent data.  */
+
+static struct machine_function *
+loongarch_init_machine_status (void)
+{
+  return ggc_cleared_alloc<machine_function> ();
+}
+
+static void
+loongarch_option_override_internal (struct gcc_options *opts,
+				    struct gcc_options *opts_set);
+
+/* Implement TARGET_OPTION_OVERRIDE.  */
+
+static void
+loongarch_option_override (void)
+{
+  loongarch_option_override_internal (&global_options, &global_options_set);
+}
+
+static void
+loongarch_option_override_internal (struct gcc_options *opts,
+				    struct gcc_options *opts_set)
+{
+  int i, regno, mode;
+
+  (void) opts_set;
+
+  if (flag_pic)
+    g_switch_value = 0;
+
+  /* Handle target-specific options: compute defaults/conflicts etc.  */
+  loongarch_config_target (&la_target, la_opt_switches,
+			   la_opt_cpu_arch, la_opt_cpu_tune, la_opt_fpu,
+			   la_opt_abi_base, la_opt_abi_ext, la_opt_cmodel, 0);
+
+  /* End of code shared with GAS.  */
+  if (TARGET_ABI_LP64)
+    flag_pcc_struct_return = 0;
+
+  /* Decide which rtx_costs structure to use.  */
+  if (optimize_size)
+    loongarch_cost = &loongarch_rtx_cost_optimize_size;
+  else
+    loongarch_cost = &loongarch_cpu_rtx_cost_data[__ACTUAL_TUNE];
+
+  /* If the user hasn't specified a branch cost, use the processor's
+     default.  */
+  if (loongarch_branch_cost == 0)
+    loongarch_branch_cost = loongarch_cost->branch_cost;
+
+  switch (la_target.cmodel)
+    {
+      case CMODEL_TINY_STATIC:
+      case CMODEL_EXTREME:
+	if (opts->x_flag_plt)
+	  error ("code models %qs and %qs do not support %s",
+		 "tiny-static", "extreme", "plt");
+	break;
+
+      case CMODEL_NORMAL:
+      case CMODEL_TINY:
+      case CMODEL_LARGE:
+	break;
+
+      default:
+	gcc_unreachable ();
+    }
+
+  loongarch_init_print_operand_punct ();
+
+  /* Set up array to map GCC register number to debug register number.
+     Ignore the special purpose register numbers.  */
+
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+    {
+      if (GP_REG_P (i) || FP_REG_P (i))
+	loongarch_dwarf_regno[i] = i;
+      else
+	loongarch_dwarf_regno[i] = INVALID_REGNUM;
+    }
+
+  /* Set up loongarch_hard_regno_mode_ok.  */
+  for (mode = 0; mode < MAX_MACHINE_MODE; mode++)
+    for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+      loongarch_hard_regno_mode_ok_p[mode][regno]
+	= loongarch_hard_regno_mode_ok_uncached (regno, (machine_mode) mode);
+
+  /* Function to allocate machine-dependent function status.  */
+  init_machine_status = &loongarch_init_machine_status;
+}
+
+/* Implement TARGET_CONDITIONAL_REGISTER_USAGE.  */
+
+static void
+loongarch_conditional_register_usage (void)
+{
+  if (!TARGET_HARD_FLOAT)
+    accessible_reg_set &= ~(reg_class_contents[FP_REGS]
+			    | reg_class_contents[FCC_REGS]);
+}
+
+/* Implement EH_USES.  */
+
+bool
+loongarch_eh_uses (unsigned int regno ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
+/* Implement EPILOGUE_USES.  */
+
+bool
+loongarch_epilogue_uses (unsigned int regno)
+{
+  /* Say that the epilogue uses the return address register.  Note that
+     in the case of sibcalls, the values "used by the epilogue" are
+     considered live at the start of the called function.  */
+  if (regno == RETURN_ADDR_REGNUM)
+    return true;
+
+  return false;
+}
+
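+/* Return true if the two loads or stores in OPERANDS can be bonded into
+   a single paired access: both addresses must be valid, use the same base
+   register, and be exactly GET_MODE_SIZE (MODE) apart, and the registers
+   must be of compatible classes.  LOAD_P says whether OPERANDS are loads;
+   bonded loads must not clobber the base register or load into the same
+   register twice.  */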
+bool
+loongarch_load_store_bonding_p (rtx *operands, machine_mode mode, bool load_p)
+{
+  rtx reg1, reg2, mem1, mem2, base1, base2;
+  enum reg_class rc1, rc2;
+  HOST_WIDE_INT offset1, offset2;
+
+  if (load_p)
+    {
+      reg1 = operands[0];
+      reg2 = operands[2];
+      mem1 = operands[1];
+      mem2 = operands[3];
+    }
+  else
+    {
+      reg1 = operands[1];
+      reg2 = operands[3];
+      mem1 = operands[0];
+      mem2 = operands[2];
+    }
+
+  if (loongarch_address_insns (XEXP (mem1, 0), mode, false) == 0
+      || loongarch_address_insns (XEXP (mem2, 0), mode, false) == 0)
+    return false;
+
+  loongarch_split_plus (XEXP (mem1, 0), &base1, &offset1);
+  loongarch_split_plus (XEXP (mem2, 0), &base2, &offset2);
+
+  /* Base regs do not match.  */
+  if (!REG_P (base1) || !rtx_equal_p (base1, base2))
+    return false;
+
+  /* Reject the pair if either load clobbers the base register.  Bonding
+     would be legitimate when only the second load clobbers the base, but
+     the hardware does not support that form of bonding.  */
+  if (load_p
+      && (REGNO (reg1) == REGNO (base1) || (REGNO (reg2) == REGNO (base1))))
+    return false;
+
+  /* Loading in same registers.  */
+  if (load_p && REGNO (reg1) == REGNO (reg2))
+    return false;
+
+  /* The loads/stores are not of same type.  */
+  rc1 = REGNO_REG_CLASS (REGNO (reg1));
+  rc2 = REGNO_REG_CLASS (REGNO (reg2));
+  if (rc1 != rc2 && !reg_class_subset_p (rc1, rc2)
+      && !reg_class_subset_p (rc2, rc1))
+    return false;
+
+  if (abs (offset1 - offset2) != GET_MODE_SIZE (mode))
+    return false;
+
+  return true;
+}
+
+/* Implement TARGET_TRAMPOLINE_INIT.  */
+
+static void
+loongarch_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
+{
+  rtx addr, end_addr, mem;
+  rtx trampoline[8];
+  unsigned int i, j;
+  HOST_WIDE_INT end_addr_offset, static_chain_offset, target_function_offset;
+
+  /* Work out the offsets of the pointers from the start of the
+     trampoline code.  */
+  end_addr_offset = TRAMPOLINE_CODE_SIZE;
+  static_chain_offset = end_addr_offset;
+  target_function_offset = static_chain_offset + GET_MODE_SIZE (ptr_mode);
+
+  /* Get pointers to the beginning and end of the code block.  */
+  addr = force_reg (Pmode, XEXP (m_tramp, 0));
+  end_addr
+    = loongarch_force_binary (Pmode, PLUS, addr, GEN_INT (end_addr_offset));
+
+#define OP(X) gen_int_mode (X, SImode)
+
+  /* Build up the code in TRAMPOLINE.  */
+  i = 0;
+  /* pcaddi $static_chain,0
+     ld.[dw] $tmp,$static_chain,target_function_offset
+     ld.[dw] $static_chain,$static_chain,static_chain_offset
+     jirl $r0,$tmp,0  */
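+  /* In the instruction words below, the destination register field
+     occupies bits 0-4, the source register field bits 5-9, and the
+     12-bit load offset bits 10-21.  */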
+  trampoline[i++] = OP (0x18000000 | (STATIC_CHAIN_REGNUM - GP_REG_FIRST));
+  trampoline[i++] = OP ((ptr_mode == DImode ? 0x28c00000 : 0x28800000)
+			| 19 /* $t7  */
+			| ((STATIC_CHAIN_REGNUM - GP_REG_FIRST) << 5)
+			| ((target_function_offset & 0xfff) << 10));
+  trampoline[i++] = OP ((ptr_mode == DImode ? 0x28c00000 : 0x28800000)
+			| (STATIC_CHAIN_REGNUM - GP_REG_FIRST)
+			| ((STATIC_CHAIN_REGNUM - GP_REG_FIRST) << 5)
+			| ((static_chain_offset & 0xfff) << 10));
+  trampoline[i++] = OP (0x4c000000 | (19 << 5));
+#undef OP
+
+  for (j = 0; j < i; j++)
+    {
+      mem = adjust_address (m_tramp, SImode, j * GET_MODE_SIZE (SImode));
+      loongarch_emit_move (mem, trampoline[j]);
+    }
+
+  /* Set up the static chain pointer field.  */
+  mem = adjust_address (m_tramp, ptr_mode, static_chain_offset);
+  loongarch_emit_move (mem, chain_value);
+
+  /* Set up the target function field.  */
+  mem = adjust_address (m_tramp, ptr_mode, target_function_offset);
+  loongarch_emit_move (mem, XEXP (DECL_RTL (fndecl), 0));
+
+  /* Flush the code part of the trampoline.  */
+  emit_insn (gen_add3_insn (end_addr, addr, GEN_INT (TRAMPOLINE_SIZE)));
+  emit_insn (gen_clear_cache (addr, end_addr));
+}
+
+/* Implement TARGET_SHIFT_TRUNCATION_MASK.  We want to keep the default
+   behavior of TARGET_SHIFT_TRUNCATION_MASK for non-vector modes.  */
+
+static unsigned HOST_WIDE_INT
+loongarch_shift_truncation_mask (machine_mode mode)
+{
+  return GET_MODE_BITSIZE (mode) - 1;
+}
+
+/* Implement TARGET_SCHED_REASSOCIATION_WIDTH.  */
+
+static int
+loongarch_sched_reassociation_width (unsigned int opc ATTRIBUTE_UNUSED,
+				     machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return 1;
+}
+
+/* Implement HARD_REGNO_CALLER_SAVE_MODE.  */
+
+machine_mode
+loongarch_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs,
+				       machine_mode mode)
+{
+  /* For performance, avoid saving/restoring upper parts of a register
+     by returning MODE as save mode when the mode is known.  */
+  if (mode == VOIDmode)
+    return choose_hard_reg_mode (regno, nregs, NULL);
+  else
+    return mode;
+}
+
+/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
+
+unsigned int
+loongarch_case_values_threshold (void)
+{
+  return default_case_values_threshold ();
+}
+
+/* Implement TARGET_SPILL_CLASS.  */
+
+static reg_class_t
+loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
+		       machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return NO_REGS;
+}
+
+/* Implement TARGET_LRA_P.  */
+
+static bool
+loongarch_lra_p (void)
+{
+  return loongarch_lra_flag;
+}
+
+/* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.  */
+
+static reg_class_t
+loongarch_ira_change_pseudo_allocno_class (int regno,
+					   reg_class_t allocno_class,
+					   reg_class_t best_class
+					   ATTRIBUTE_UNUSED)
+{
+  /* LRA will allocate an FPR for an integer mode pseudo instead of spilling
+     to memory if an FPR is present in the allocno class.  It is rare that
+     we actually need to place an integer mode value in an FPR so where
+     possible limit the allocation to GR_REGS.  This will slightly pessimize
+     code that involves integer to/from float conversions as these will have
+     to reload into FPRs in LRA.  Such reloads are sometimes eliminated and
+     sometimes only partially eliminated.  We choose to take this penalty
+     in order to eliminate usage of FPRs in code that does not use floating
+     point data.
+
+     This change has a similar effect to increasing the cost of FPR->GPR
+     register moves for integer modes so that they are higher than the cost
+     of memory but changing the allocno class is more reliable.
+
+     This is also similar to forbidding integer mode values in FPRs entirely
+     but this would lead to an inconsistency in the integer to/from float
+     instructions that say integer mode values must be placed in FPRs.  */
+  if (INTEGRAL_MODE_P (PSEUDO_REGNO_MODE (regno)) && allocno_class == ALL_REGS)
+    return GR_REGS;
+  return allocno_class;
+}
+
+/* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
+
+/* This function is equivalent to default_promote_function_mode_always_promote
+   except that it returns a promoted mode even if type is NULL_TREE.  This is
+   needed by libcalls which have no type (only a mode) such as fixed conversion
+   routines that take a signed or unsigned char/short argument and convert it
+   to a fixed type.  */
+
+static machine_mode
+loongarch_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
+				 machine_mode mode,
+				 int *punsignedp ATTRIBUTE_UNUSED,
+				 const_tree fntype ATTRIBUTE_UNUSED,
+				 int for_return ATTRIBUTE_UNUSED)
+{
+  int unsignedp;
+
+  if (type != NULL_TREE)
+    return promote_mode (type, mode, punsignedp);
+
+  unsignedp = *punsignedp;
+  PROMOTE_MODE (mode, unsignedp, type);
+  *punsignedp = unsignedp;
+  return mode;
+}
+
+/* Implement TARGET_TRULY_NOOP_TRUNCATION.  */
+
+static bool
+loongarch_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
+{
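+  /* Truncation is a no-op except when a value wider than 32 bits is
+     truncated to 32 bits or fewer on a 64-bit target, since SImode
+     values are kept sign-extended in 64-bit registers.  */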
+  return !TARGET_64BIT || inprec <= 32 || outprec > 32;
+}
+
+/* Implement TARGET_CONSTANT_ALIGNMENT.  */
+
+static HOST_WIDE_INT
+loongarch_constant_alignment (const_tree exp, HOST_WIDE_INT align)
+{
+  if (TREE_CODE (exp) == STRING_CST || TREE_CODE (exp) == CONSTRUCTOR)
+    return MAX (align, BITS_PER_WORD);
+  return align;
+}
+
+/* Implement TARGET_STARTING_FRAME_OFFSET.  See loongarch_compute_frame_info
+   for details about the frame layout.  */
+
+static HOST_WIDE_INT
+loongarch_starting_frame_offset (void)
+{
+  if (FRAME_GROWS_DOWNWARD)
+    return 0;
+  return crtl->outgoing_args_size;
+}
+
+/* Initialize the GCC target structure.  */
+#undef TARGET_ASM_ALIGNED_HI_OP
+#define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
+#undef TARGET_ASM_ALIGNED_SI_OP
+#define TARGET_ASM_ALIGNED_SI_OP "\t.word\t"
+#undef TARGET_ASM_ALIGNED_DI_OP
+#define TARGET_ASM_ALIGNED_DI_OP "\t.dword\t"
+
+#undef TARGET_OPTION_OVERRIDE
+#define TARGET_OPTION_OVERRIDE loongarch_option_override
+
+#undef TARGET_LEGITIMIZE_ADDRESS
+#define TARGET_LEGITIMIZE_ADDRESS loongarch_legitimize_address
+
+#undef TARGET_ASM_SELECT_RTX_SECTION
+#define TARGET_ASM_SELECT_RTX_SECTION loongarch_select_rtx_section
+#undef TARGET_ASM_FUNCTION_RODATA_SECTION
+#define TARGET_ASM_FUNCTION_RODATA_SECTION loongarch_function_rodata_section
+
+#undef TARGET_SCHED_INIT
+#define TARGET_SCHED_INIT loongarch_sched_init
+#undef TARGET_SCHED_REORDER
+#define TARGET_SCHED_REORDER loongarch_sched_reorder
+#undef TARGET_SCHED_REORDER2
+#define TARGET_SCHED_REORDER2 loongarch_sched_reorder2
+#undef TARGET_SCHED_VARIABLE_ISSUE
+#define TARGET_SCHED_VARIABLE_ISSUE loongarch_variable_issue
+#undef TARGET_SCHED_ADJUST_COST
+#define TARGET_SCHED_ADJUST_COST loongarch_adjust_cost
+#undef TARGET_SCHED_ISSUE_RATE
+#define TARGET_SCHED_ISSUE_RATE loongarch_issue_rate
+#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
+#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
+  loongarch_multipass_dfa_lookahead
+
+#undef TARGET_FUNCTION_OK_FOR_SIBCALL
+#define TARGET_FUNCTION_OK_FOR_SIBCALL loongarch_function_ok_for_sibcall
+
+#undef TARGET_VALID_POINTER_MODE
+#define TARGET_VALID_POINTER_MODE loongarch_valid_pointer_mode
+#undef TARGET_REGISTER_MOVE_COST
+#define TARGET_REGISTER_MOVE_COST loongarch_register_move_cost
+#undef TARGET_MEMORY_MOVE_COST
+#define TARGET_MEMORY_MOVE_COST loongarch_memory_move_cost
+#undef TARGET_RTX_COSTS
+#define TARGET_RTX_COSTS loongarch_rtx_costs
+#undef TARGET_ADDRESS_COST
+#define TARGET_ADDRESS_COST loongarch_address_cost
+
+#undef TARGET_IN_SMALL_DATA_P
+#define TARGET_IN_SMALL_DATA_P loongarch_in_small_data_p
+
+#undef TARGET_PREFERRED_RELOAD_CLASS
+#define TARGET_PREFERRED_RELOAD_CLASS loongarch_preferred_reload_class
+
+#undef TARGET_EXPAND_TO_RTL_HOOK
+#define TARGET_EXPAND_TO_RTL_HOOK loongarch_expand_to_rtl_hook
+#undef TARGET_ASM_FILE_START
+#define TARGET_ASM_FILE_START loongarch_file_start
+#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE
+#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true
+
+#undef TARGET_EXPAND_BUILTIN_VA_START
+#define TARGET_EXPAND_BUILTIN_VA_START loongarch_va_start
+
+#undef TARGET_PROMOTE_FUNCTION_MODE
+#define TARGET_PROMOTE_FUNCTION_MODE loongarch_promote_function_mode
+#undef TARGET_RETURN_IN_MEMORY
+#define TARGET_RETURN_IN_MEMORY loongarch_return_in_memory
+
+#undef TARGET_FUNCTION_VALUE
+#define TARGET_FUNCTION_VALUE loongarch_function_value
+#undef TARGET_LIBCALL_VALUE
+#define TARGET_LIBCALL_VALUE loongarch_libcall_value
+
+#undef TARGET_ASM_OUTPUT_MI_THUNK
+#define TARGET_ASM_OUTPUT_MI_THUNK loongarch_output_mi_thunk
+#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
+#define TARGET_ASM_CAN_OUTPUT_MI_THUNK \
+  hook_bool_const_tree_hwi_hwi_const_tree_true
+
+#undef TARGET_PRINT_OPERAND
+#define TARGET_PRINT_OPERAND loongarch_print_operand
+#undef TARGET_PRINT_OPERAND_ADDRESS
+#define TARGET_PRINT_OPERAND_ADDRESS loongarch_print_operand_address
+#undef TARGET_PRINT_OPERAND_PUNCT_VALID_P
+#define TARGET_PRINT_OPERAND_PUNCT_VALID_P \
+  loongarch_print_operand_punct_valid_p
+
+#undef TARGET_SETUP_INCOMING_VARARGS
+#define TARGET_SETUP_INCOMING_VARARGS loongarch_setup_incoming_varargs
+#undef TARGET_STRICT_ARGUMENT_NAMING
+#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
+#undef TARGET_MUST_PASS_IN_STACK
+#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size
+#undef TARGET_PASS_BY_REFERENCE
+#define TARGET_PASS_BY_REFERENCE loongarch_pass_by_reference
+#undef TARGET_ARG_PARTIAL_BYTES
+#define TARGET_ARG_PARTIAL_BYTES loongarch_arg_partial_bytes
+#undef TARGET_FUNCTION_ARG
+#define TARGET_FUNCTION_ARG loongarch_function_arg
+#undef TARGET_FUNCTION_ARG_ADVANCE
+#define TARGET_FUNCTION_ARG_ADVANCE loongarch_function_arg_advance
+#undef TARGET_FUNCTION_ARG_BOUNDARY
+#define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
+
+#undef TARGET_SCALAR_MODE_SUPPORTED_P
+#define TARGET_SCALAR_MODE_SUPPORTED_P loongarch_scalar_mode_supported_p
+
+#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
+#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE loongarch_preferred_simd_mode
+
+#undef TARGET_INIT_BUILTINS
+#define TARGET_INIT_BUILTINS loongarch_init_builtins
+#undef TARGET_BUILTIN_DECL
+#define TARGET_BUILTIN_DECL loongarch_builtin_decl
+#undef TARGET_EXPAND_BUILTIN
+#define TARGET_EXPAND_BUILTIN loongarch_expand_builtin
+
+/* The generic ELF target does not always have TLS support.  */
+#ifdef HAVE_AS_TLS
+#undef TARGET_HAVE_TLS
+#define TARGET_HAVE_TLS HAVE_AS_TLS
+#endif
+
+#undef TARGET_CANNOT_FORCE_CONST_MEM
+#define TARGET_CANNOT_FORCE_CONST_MEM loongarch_cannot_force_const_mem
+
+#undef TARGET_LEGITIMATE_CONSTANT_P
+#define TARGET_LEGITIMATE_CONSTANT_P loongarch_legitimate_constant_p
+
+#undef TARGET_ATTRIBUTE_TABLE
+#define TARGET_ATTRIBUTE_TABLE loongarch_attribute_table
+/* All our function attributes are related to how out-of-line copies should
+   be compiled or called.  They don't in themselves prevent inlining.  */
+#undef TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P
+#define TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P hook_bool_const_tree_true
+
+#undef TARGET_USE_BLOCKS_FOR_CONSTANT_P
+#define TARGET_USE_BLOCKS_FOR_CONSTANT_P hook_bool_mode_const_rtx_true
+#undef TARGET_USE_ANCHORS_FOR_SYMBOL_P
+#define TARGET_USE_ANCHORS_FOR_SYMBOL_P loongarch_use_anchors_for_symbol_p
+
+#ifdef HAVE_AS_DTPRELWORD
+#undef TARGET_ASM_OUTPUT_DWARF_DTPREL
+#define TARGET_ASM_OUTPUT_DWARF_DTPREL loongarch_output_dwarf_dtprel
+#endif
+#undef TARGET_DWARF_FRAME_REG_MODE
+#define TARGET_DWARF_FRAME_REG_MODE loongarch_dwarf_frame_reg_mode
+
+#undef TARGET_LEGITIMATE_ADDRESS_P
+#define TARGET_LEGITIMATE_ADDRESS_P loongarch_legitimate_address_p
+
+#undef TARGET_FRAME_POINTER_REQUIRED
+#define TARGET_FRAME_POINTER_REQUIRED loongarch_frame_pointer_required
+
+#undef TARGET_CAN_ELIMINATE
+#define TARGET_CAN_ELIMINATE loongarch_can_eliminate
+
+#undef TARGET_CONDITIONAL_REGISTER_USAGE
+#define TARGET_CONDITIONAL_REGISTER_USAGE loongarch_conditional_register_usage
+
+#undef TARGET_TRAMPOLINE_INIT
+#define TARGET_TRAMPOLINE_INIT loongarch_trampoline_init
+
+#undef TARGET_SHIFT_TRUNCATION_MASK
+#define TARGET_SHIFT_TRUNCATION_MASK loongarch_shift_truncation_mask
+
+#undef TARGET_SCHED_REASSOCIATION_WIDTH
+#define TARGET_SCHED_REASSOCIATION_WIDTH loongarch_sched_reassociation_width
+
+#undef TARGET_CASE_VALUES_THRESHOLD
+#define TARGET_CASE_VALUES_THRESHOLD loongarch_case_values_threshold
+
+#undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
+#define TARGET_ATOMIC_ASSIGN_EXPAND_FENV loongarch_atomic_assign_expand_fenv
+
+#undef TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS
+#define TARGET_CALL_FUSAGE_CONTAINS_NON_CALLEE_CLOBBERS true
+
+#undef TARGET_SPILL_CLASS
+#define TARGET_SPILL_CLASS loongarch_spill_class
+#undef TARGET_LRA_P
+#define TARGET_LRA_P loongarch_lra_p
+#undef TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS
+#define TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS \
+  loongarch_ira_change_pseudo_allocno_class
+
+#undef TARGET_HARD_REGNO_NREGS
+#define TARGET_HARD_REGNO_NREGS loongarch_hard_regno_nregs
+#undef TARGET_HARD_REGNO_MODE_OK
+#define TARGET_HARD_REGNO_MODE_OK loongarch_hard_regno_mode_ok
+
+#undef TARGET_MODES_TIEABLE_P
+#define TARGET_MODES_TIEABLE_P loongarch_modes_tieable_p
+
+#undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
+#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2
+
+#undef TARGET_SECONDARY_MEMORY_NEEDED
+#define TARGET_SECONDARY_MEMORY_NEEDED loongarch_secondary_memory_needed
+
+#undef TARGET_CAN_CHANGE_MODE_CLASS
+#define TARGET_CAN_CHANGE_MODE_CLASS loongarch_can_change_mode_class
+
+#undef TARGET_TRULY_NOOP_TRUNCATION
+#define TARGET_TRULY_NOOP_TRUNCATION loongarch_truly_noop_truncation
+
+#undef TARGET_CONSTANT_ALIGNMENT
+#define TARGET_CONSTANT_ALIGNMENT loongarch_constant_alignment
+
+#undef TARGET_STARTING_FRAME_OFFSET
+#define TARGET_STARTING_FRAME_OFFSET loongarch_starting_frame_offset
+
+#undef TARGET_SECONDARY_RELOAD
+#define TARGET_SECONDARY_RELOAD loongarch_secondary_reload
+
+#undef  TARGET_HAVE_SPECULATION_SAFE_VALUE
+#define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
+
+struct gcc_target targetm = TARGET_INITIALIZER;
+
+#include "gt-loongarch.h"
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
new file mode 100644
index 00000000000..791f9d064c5
--- /dev/null
+++ b/gcc/config/loongarch/loongarch.h
@@ -0,0 +1,1273 @@
+/* Definitions of target machine for GNU compiler.  LoongArch version.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS and RISC-V target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* LoongArch external variables defined in loongarch.cc.  */
+
+#include "config/loongarch/loongarch-opts.h"
+
+/* Macros to silence warnings about numbers being signed in traditional
+   C and unsigned in ISO C when compiled on 32-bit hosts.  */
+
+#define BITMASK_HIGH (((unsigned long) 1) << 31) /* 0x80000000  */
+
+/* Run-time compilation parameters selecting different hardware subsets.  */
+
+/* Target CPU builtins.  */
+#define TARGET_CPU_CPP_BUILTINS() loongarch_cpu_cpp_builtins (pfile)
+
+/* Default target_flags if no switches are specified.  */
+
+#ifdef IN_LIBGCC2
+#undef TARGET_64BIT
+/* Make this compile time constant for libgcc2.  */
+#ifdef __loongarch64
+#define TARGET_64BIT 1
+#else
+#define TARGET_64BIT 0
+#endif
+#endif /* IN_LIBGCC2  */
+
+#define TARGET_LIBGCC_SDATA_SECTION ".sdata"
+
+/* Support for a compile-time default CPU, et cetera.  The rules are:
+   --with-divide is ignored if -mdivide-traps or -mdivide-breaks are
+     specified.  */
+#define OPTION_DEFAULT_SPECS \
+  {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}"},
+
+/* Driver native functions for SPEC processing in the GCC driver.  */
+#include "loongarch-driver.h"
+
+/* This definition replaces the formerly used 'm' constraint with a
+   different constraint letter in order to avoid changing semantics of
+   the 'm' constraint when accepting new address formats in
+   TARGET_LEGITIMATE_ADDRESS_P.  The constraint letter defined here
+   must not be used in insn definitions or inline assemblies.  */
+#define TARGET_MEM_CONSTRAINT 'w'
+
+/* Tell collect what flags to pass to nm.  */
+#ifndef NM_FLAGS
+#define NM_FLAGS "-Bn"
+#endif
+
+/* SUBTARGET_ASM_DEBUGGING_SPEC handles passing debugging options to
+   the assembler.  It may be overridden by subtargets.  */
+
+#ifndef SUBTARGET_ASM_DEBUGGING_SPEC
+#define SUBTARGET_ASM_DEBUGGING_SPEC "\
+%{g} %{g0} %{g1} %{g2} %{g3} \
+%{ggdb:-g} %{ggdb0:-g0} %{ggdb1:-g1} %{ggdb2:-g2} %{ggdb3:-g3} \
+%{gstabs:-g} %{gstabs0:-g0} %{gstabs1:-g1} %{gstabs2:-g2} %{gstabs3:-g3} \
+%{gstabs+:-g} %{gstabs+0:-g0} %{gstabs+1:-g1} %{gstabs+2:-g2} %{gstabs+3:-g3}"
+#endif
+
+/* SUBTARGET_ASM_SPEC is always passed to the assembler.  It may be
+   overridden by subtargets.  */
+
+#ifndef SUBTARGET_ASM_SPEC
+#define SUBTARGET_ASM_SPEC ""
+#endif
+
+#undef ASM_SPEC
+#define ASM_SPEC "%{mabi=*} %(subtarget_asm_spec)"
+
+/* Extra switches sometimes passed to the linker.  */
+
+#ifndef LINK_SPEC
+#define LINK_SPEC ""
+#endif /* LINK_SPEC defined  */
+
+/* Specs for the compiler proper.  */
+
+/* CC1_SPEC is the set of arguments to pass to the compiler proper.  */
+
+#undef CC1_SPEC
+#define CC1_SPEC "\
+%{G*} \
+%(subtarget_cc1_spec)"
+
+/* Preprocessor specs.  */
+
+/* SUBTARGET_CPP_SPEC is passed to the preprocessor.  It may be
+   overridden by subtargets.  */
+#ifndef SUBTARGET_CPP_SPEC
+#define SUBTARGET_CPP_SPEC ""
+#endif
+
+#define CPP_SPEC "%(subtarget_cpp_spec)"
+
+/* This macro defines names of additional specifications to put in the specs
+   that can be used in various specifications like CC1_SPEC.  Its definition
+   is an initializer with a subgrouping for each command option.
+
+   Each subgrouping contains a string constant, that defines the
+   specification name, and a string constant that used by the GCC driver
+   program.
+
+   Do not define this macro if it does not need to do anything.  */
+
+#define EXTRA_SPECS \
+  {"subtarget_cc1_spec", SUBTARGET_CC1_SPEC}, \
+  {"subtarget_cpp_spec", SUBTARGET_CPP_SPEC}, \
+  {"subtarget_asm_debugging_spec", SUBTARGET_ASM_DEBUGGING_SPEC}, \
+  {"subtarget_asm_spec", SUBTARGET_ASM_SPEC},
+
+/* Registers may have a prefix which can be ignored when matching
+   user asm and register definitions.  */
+#ifndef REGISTER_PREFIX
+#define REGISTER_PREFIX "$"
+#endif
+
+/* Local compiler-generated symbols must have a prefix that the assembler
+   understands.  */
+
+#define LOCAL_LABEL_PREFIX "."
+
+/* By default on LoongArch, external symbols do not have an underscore
+   prepended.  */
+
+#define USER_LABEL_PREFIX ""
+
+#ifndef PREFERRED_DEBUGGING_TYPE
+#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
+#endif
+
+/* The size of DWARF addresses should be the same as the size of symbols
+   in the target file format.  */
+#define DWARF2_ADDR_SIZE (TARGET_64BIT ? 8 : 4)
+
+/* By default, turn on GDB extensions.  */
+#define DEFAULT_GDB_EXTENSIONS 1
+
+/* By default, produce DWARF version 2 format debugging output in response
+   to the -g option.  */
+#define DWARF2_DEBUGGING_INFO 1
+
+/* The mapping from gcc register number to DWARF 2 CFA column number.  */
+#define DWARF_FRAME_REGNUM(REGNO) loongarch_dwarf_regno[REGNO]
+
+/* The DWARF 2 CFA column which tracks the return address.  */
+#define DWARF_FRAME_RETURN_COLUMN RETURN_ADDR_REGNUM
+
+/* Before the prologue, RA lives in r1.  */
+#define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM)
+
+/* Describe how we implement __builtin_eh_return.  */
+#define EH_RETURN_DATA_REGNO(N) \
+  ((N) < (4) ? (N) + GP_ARG_FIRST : INVALID_REGNUM)
+
+#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, GP_ARG_FIRST + 4)
+
+#define EH_USES(N) loongarch_eh_uses (N)
+
+/* Offsets recorded in opcodes are a multiple of this alignment factor.
+   The default for this in 64-bit mode is 8, which causes problems with
+   SFmode register saves.  */
+#define DWARF_CIE_DATA_ALIGNMENT -4
+
+/* Target machine storage layout.  */
+
+#define BITS_BIG_ENDIAN 0
+#define BYTES_BIG_ENDIAN 0
+#define WORDS_BIG_ENDIAN 0
+
+#define MAX_BITS_PER_WORD 64
+
+/* Width of a word, in units (bytes).  */
+#define UNITS_PER_WORD (TARGET_64BIT ? 8 : 4)
+#ifndef IN_LIBGCC2
+#define MIN_UNITS_PER_WORD 4
+#endif
+
+/* The width of a floating-point register, in bytes.  */
+#define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
+
+/* The number of consecutive floating-point registers needed to store the
+   largest format supported by the FPU.  */
+#define MAX_FPRS_PER_FMT 1
+
+/* The number of consecutive floating-point registers needed to store the
+   smallest format supported by the FPU.  */
+#define MIN_FPRS_PER_FMT 1
+
+/* The largest size of value that can be held in floating-point
+   registers and moved with a single instruction.  */
+#define UNITS_PER_HWFPVALUE \
+  (TARGET_SOFT_FLOAT ? 0 : MAX_FPRS_PER_FMT * UNITS_PER_FPREG)
+
+/* The largest size of value that can be held in floating-point
+   registers.  */
+#define UNITS_PER_FPVALUE \
+  (TARGET_SOFT_FLOAT ? 0 \
+   : TARGET_SINGLE_FLOAT ? UNITS_PER_FPREG \
+			 : LONG_DOUBLE_TYPE_SIZE / BITS_PER_UNIT)
+
+/* The number of bytes in a double.  */
+#define UNITS_PER_DOUBLE (TYPE_PRECISION (double_type_node) / BITS_PER_UNIT)
+
+/* Set the sizes of the core types.  */
+#define SHORT_TYPE_SIZE 16
+#define INT_TYPE_SIZE 32
+#define LONG_TYPE_SIZE (TARGET_64BIT ? 64 : 32)
+#define LONG_LONG_TYPE_SIZE 64
+
+#define FLOAT_TYPE_SIZE 32
+#define DOUBLE_TYPE_SIZE 64
+#define LONG_DOUBLE_TYPE_SIZE (TARGET_64BIT ? 128 : 64)
+
+/* Define the sizes of fixed-point types.  */
+#define SHORT_FRACT_TYPE_SIZE 8
+#define FRACT_TYPE_SIZE 16
+#define LONG_FRACT_TYPE_SIZE 32
+#define LONG_LONG_FRACT_TYPE_SIZE 64
+
+#define SHORT_ACCUM_TYPE_SIZE 16
+#define ACCUM_TYPE_SIZE 32
+#define LONG_ACCUM_TYPE_SIZE 64
+#define LONG_LONG_ACCUM_TYPE_SIZE (TARGET_64BIT ? 128 : 64)
+
+/* long double is not a fixed mode, but the idea is that, if we
+   support long double, we also want a 128-bit integer type.  */
+#define MAX_FIXED_MODE_SIZE LONG_DOUBLE_TYPE_SIZE
+
+/* Width in bits of a pointer.  */
+#ifndef POINTER_SIZE
+#define POINTER_SIZE (TARGET_64BIT ? 64 : 32)
+#endif
+
+/* Allocation boundary (in *bits*) for storing arguments in argument list.  */
+#define PARM_BOUNDARY BITS_PER_WORD
+
+/* Allocation boundary (in *bits*) for the code of a function.  */
+#define FUNCTION_BOUNDARY 32
+
+/* Alignment of field after `int : 0' in a structure.  */
+#define EMPTY_FIELD_BOUNDARY 32
+
+/* Number of bits which any structure or union's size must be a multiple of.
+   Each structure or union's size is rounded up to a multiple of this.  */
+#define STRUCTURE_SIZE_BOUNDARY 8
+
+/* There is no point aligning anything to a rounder boundary than
+   LONG_DOUBLE_TYPE_SIZE.  */
+#define BIGGEST_ALIGNMENT (LONG_DOUBLE_TYPE_SIZE)
+
+/* All accesses must be aligned.  */
+#define STRICT_ALIGNMENT (TARGET_STRICT_ALIGN)
+
+/* Define this if you wish to imitate the way many other C compilers
+   handle alignment of bitfields and the structures that contain
+   them.
+
+   The behavior is that the type written for a bit-field (`int',
+   `short', or other integer type) imposes an alignment for the
+   entire structure, as if the structure really did contain an
+   ordinary field of that type.  In addition, the bit-field is placed
+   within the structure so that it would fit within such a field,
+   not crossing a boundary for it.
+
+   Thus, on most machines, a bit-field whose type is written as `int'
+   would not cross a four-byte boundary, and would force four-byte
+   alignment for the whole structure.  (The alignment used may not
+   be four bytes; it is controlled by the other alignment
+   parameters.)
+
+   If the macro is defined, its definition should be a C expression;
+   a nonzero value for the expression enables this behavior.  */
+
+#define PCC_BITFIELD_TYPE_MATTERS 1
+
+/* If defined, a C expression to compute the alignment for a static
+   variable.  TYPE is the data type, and ALIGN is the alignment that
+   the object would ordinarily have.  The value of this macro is used
+   instead of that alignment to align the object.
+
+   If this macro is not defined, then ALIGN is used.
+
+   One use of this macro is to increase alignment of medium-size
+   data to make it all fit in fewer cache lines.  Another is to
+   cause character arrays to be word-aligned so that `strcpy' calls
+   that copy constants to character arrays can be done inline.  */
+
+#undef DATA_ALIGNMENT
+#define DATA_ALIGNMENT(TYPE, ALIGN)					\
+  ((((ALIGN) < BITS_PER_WORD)						\
+    && (TREE_CODE (TYPE) == ARRAY_TYPE					\
+	|| TREE_CODE (TYPE) == UNION_TYPE				\
+	|| TREE_CODE (TYPE) == RECORD_TYPE)) ? BITS_PER_WORD : (ALIGN))
+
+/* We need this for the same reason as DATA_ALIGNMENT, namely to cause
+   character arrays to be word-aligned so that `strcpy' calls that copy
+   constants to character arrays can be done inline, and 'strcmp' can be
+   optimised to use word loads.  */
+#define LOCAL_ALIGNMENT(TYPE, ALIGN) DATA_ALIGNMENT (TYPE, ALIGN)
+
+/* Define if operations between registers always perform the operation
+   on the full register even if a narrower mode is specified.  */
+#define WORD_REGISTER_OPERATIONS 1
+
+/* When in 64-bit mode, move insns will sign extend SImode and FCCmode
+   moves.  All other references are zero extended.  */
+#define LOAD_EXTEND_OP(MODE) \
+  (TARGET_64BIT && ((MODE) == SImode || (MODE) == FCCmode) ? SIGN_EXTEND \
+							   : ZERO_EXTEND)
+
+/* Define this macro if it is advisable to hold scalars in registers
+   in a wider mode than that declared by the program.  In such cases,
+   the value is constrained to be within the bounds of the declared
+   type, but kept valid in the wider mode.  The signedness of the
+   extension may differ from that of the type.  */
+
+#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \
+  if (GET_MODE_CLASS (MODE) == MODE_INT \
+      && GET_MODE_SIZE (MODE) < UNITS_PER_WORD) \
+    { \
+      if ((MODE) == SImode) \
+	(UNSIGNEDP) = 0; \
+      (MODE) = Pmode; \
+    }
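+
+/* For example, on a 64-bit target PROMOTE_MODE widens an SImode value to
+   DImode (Pmode) and marks it as signed, which matches the sign extension
+   performed by LOAD_EXTEND_OP above.  */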
+
+/* Pmode is always the same as ptr_mode, but not always the same as word_mode.
+   Extensions of pointers to word_mode must be signed.  */
+#define POINTERS_EXTEND_UNSIGNED false
+
+/* Define if loading short immediate values into registers sign extends.  */
+#define SHORT_IMMEDIATES_SIGN_EXTEND 1
+
+/* The clz.{w/d} instructions have the natural values at 0.  */
+
+#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
+  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
+
+/* Standard register usage.  */
+
+/* Number of hardware registers.  We have:
+
+   - 32 integer registers
+   - 32 floating point registers
+   - 8 condition code registers
+   - 2 fake registers:
+	- ARG_POINTER_REGNUM
+	- FRAME_POINTER_REGNUM
+*/
+
+#define FIRST_PSEUDO_REGISTER 74
+
+/* zero ($r0), tp ($r2), sp ($r3) and the reserved register $r21 are
+   fixed.  */
+#define FIXED_REGISTERS							\
+{ /* General registers.  */						\
+  1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,			\
+  0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,			\
+  /* Floating-point registers.  */					\
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,			\
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,			\
+  /* Others.  */							\
+  0, 0, 0, 0, 0, 0, 0, 0, 1, 1}
+
+/* The call RTLs themselves clobber ra.  */
+#define CALL_USED_REGISTERS						\
+{ /* General registers.  */						\
+  1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,			\
+  1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,			\
+  /* Floating-point registers.  */					\
+  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,			\
+  1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,			\
+  /* Others.  */							\
+  1, 1, 1, 1, 1, 1, 1, 1, 1, 1}
+
+/* Internal macros to classify a register number as to whether it's a
+   general purpose register, a floating point register, or a status
+   register.  */
+
+#define GP_REG_FIRST 0
+#define GP_REG_LAST 31
+#define GP_REG_NUM (GP_REG_LAST - GP_REG_FIRST + 1)
+
+#define FP_REG_FIRST 32
+#define FP_REG_LAST 63
+#define FP_REG_NUM (FP_REG_LAST - FP_REG_FIRST + 1)
+
+/* The DWARF 2 CFA column which tracks the return address from a
+   signal handler context.  This means that to maintain backwards
+   compatibility, no hard register can be assigned this column if it
+   would need to be handled by the DWARF unwinder.  */
+#define DWARF_ALT_FRAME_RETURN_COLUMN 72
+
+#define FCC_REG_FIRST 64
+#define FCC_REG_LAST 71
+#define FCC_REG_NUM (FCC_REG_LAST - FCC_REG_FIRST + 1)
+
+#define GP_REG_P(REGNO) \
+  ((unsigned int) ((int) (REGNO) - GP_REG_FIRST) < GP_REG_NUM)
+#define FP_REG_P(REGNO) \
+  ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM)
+#define FCC_REG_P(REGNO) \
+  ((unsigned int) ((int) (REGNO) - FCC_REG_FIRST) < FCC_REG_NUM)
+
+#define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
+
+#define HARD_REGNO_RENAME_OK(OLD_REG, NEW_REG) \
+  loongarch_hard_regno_rename_ok (OLD_REG, NEW_REG)
+
+/* Select a register mode required for caller save of hard regno REGNO.  */
+#define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
+  loongarch_hard_regno_caller_save_mode (REGNO, NREGS, MODE)
+
+/* Register to use for pushing function arguments.  */
+#define STACK_POINTER_REGNUM (GP_REG_FIRST + 3)
+
+/* These two registers don't really exist: they get eliminated to either
+   the stack or hard frame pointer.  */
+#define ARG_POINTER_REGNUM 72
+#define FRAME_POINTER_REGNUM 73
+
+#define HARD_FRAME_POINTER_REGNUM (GP_REG_FIRST + 22)
+
+#define HARD_FRAME_POINTER_IS_FRAME_POINTER 0
+#define HARD_FRAME_POINTER_IS_ARG_POINTER 0
+
+/* Register in which static-chain is passed to a function.  */
+#define STATIC_CHAIN_REGNUM (GP_REG_FIRST + 20) /* $t8  */
+
+#define GP_TEMP_FIRST (GP_REG_FIRST + 12)
+#define LARCH_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST + 1)
+#define LARCH_PROLOGUE_TEMP2_REGNUM (GP_TEMP_FIRST)
+#define LARCH_PROLOGUE_TEMP3_REGNUM (GP_TEMP_FIRST + 2)
+#define LARCH_EPILOGUE_TEMP_REGNUM (GP_TEMP_FIRST)
+
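+/* Map a callee-saved GPR ($fp and $s0-$s8, registers 22 to 31) to a
+   zero-based index, or yield -1 for any other register.  */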
+#define CALLEE_SAVED_REG_NUMBER(REGNO) \
+  ((REGNO) >= 22 && (REGNO) <= 31 ? (REGNO) - 22 : -1)
+
+#define LARCH_PROLOGUE_TEMP(MODE) \
+  gen_rtx_REG (MODE, LARCH_PROLOGUE_TEMP_REGNUM)
+#define LARCH_PROLOGUE_TEMP2(MODE) \
+  gen_rtx_REG (MODE, LARCH_PROLOGUE_TEMP2_REGNUM)
+#define LARCH_PROLOGUE_TEMP3(MODE) \
+  gen_rtx_REG (MODE, LARCH_PROLOGUE_TEMP3_REGNUM)
+#define LARCH_EPILOGUE_TEMP(MODE) \
+  gen_rtx_REG (MODE, LARCH_EPILOGUE_TEMP_REGNUM)
+
+/* Define this macro if it is as good or better to call a constant
+   function address than to call an address kept in a register.  */
+#define NO_FUNCTION_CSE 1
+
+#define THREAD_POINTER_REGNUM (GP_REG_FIRST + 2)
+
+/* Define the classes of registers for register constraints in the
+   machine description.  Also define ranges of constants.
+
+   One of the classes must always be named ALL_REGS and include all hard regs.
+   If there is more than one class, another class must be named NO_REGS
+   and contain no registers.
+
+   The name GENERAL_REGS must be the name of a class (or an alias for
+   another name such as ALL_REGS).  This is the class of registers
+   that is allowed by "r" in a register constraint.
+   Also, registers outside this class are allocated only when
+   instructions express preferences for them.
+
+   The classes must be numbered in nondecreasing order; that is,
+   a larger-numbered class must never be contained completely
+   in a smaller-numbered class.
+
+   For any two classes, it is very desirable that there be another
+   class that represents their union.  */
+
+enum reg_class
+{
+  NO_REGS,	  /* no registers in set  */
+  SIBCALL_REGS,	  /* registers used for sibling calls  */
+  JIRL_REGS,	  /* registers that can be used in a jirl instruction  */
+  GR_REGS,	  /* integer registers  */
+  CSR_REGS,	  /* integer registers except $r0 and $r1, for lcsr  */
+  FP_REGS,	  /* floating point registers  */
+  FCC_REGS,	  /* status registers (fp status)  */
+  FRAME_REGS,	  /* arg pointer and frame pointer  */
+  ALL_REGS,	  /* all registers  */
+  LIM_REG_CLASSES /* max value + 1  */
+};
+
+#define N_REG_CLASSES (int) LIM_REG_CLASSES
+
+#define GENERAL_REGS GR_REGS
+
+/* An initializer containing the names of the register classes as C
+   string constants.  These names are used in writing some of the
+   debugging dumps.  */
+
+#define REG_CLASS_NAMES							\
+{									\
+  "NO_REGS",								\
+  "SIBCALL_REGS",							\
+  "JIRL_REGS",								\
+  "GR_REGS",								\
+  "CSR_REGS",								\
+  "FP_REGS",								\
+  "FCC_REGS",								\
+  "FRAME_REGS",								\
+  "ALL_REGS"								\
+}
+
+/* An initializer containing the contents of the register classes,
+   as integers which are bit masks.  The Nth integer specifies the
+   contents of class N.  The way the integer MASK is interpreted is
+   that register R is in the class if `MASK & (1 << R)' is 1.
+
+   When the machine has more than 32 registers, an integer does not
+   suffice.  Then the integers are replaced by sub-initializers,
+   braced groupings containing several integers.  Each
+   sub-initializer must be suitable as an initializer for the type
+   `HARD_REG_SET' which is defined in `hard-reg-set.h'.  */
+
+#define REG_CLASS_CONTENTS						\
+{									\
+  { 0x00000000, 0x00000000, 0x00000000 },	/* NO_REGS  */		\
+  { 0x001ff000, 0x00000000, 0x00000000 },	/* SIBCALL_REGS  */	\
+  { 0xff9ffff0, 0x00000000, 0x00000000 },	/* JIRL_REGS  */	\
+  { 0xffffffff, 0x00000000, 0x00000000 },	/* GR_REGS  */		\
+  { 0xfffffffc, 0x00000000, 0x00000000 },	/* CSR_REGS  */		\
+  { 0x00000000, 0xffffffff, 0x00000000 },	/* FP_REGS  */		\
+  { 0x00000000, 0x00000000, 0x000000ff },	/* FCC_REGS  */		\
+  { 0x00000000, 0x00000000, 0x00000300 },	/* FRAME_REGS  */	\
+  { 0xffffffff, 0xffffffff, 0x000003ff }	/* ALL_REGS  */		\
+}
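+
+/* As an illustration of the encoding: the first word of SIBCALL_REGS is
+   0x001ff000, i.e. bits 12-20 are set, so exactly $r12-$r20 ($t0-$t8)
+   belong to that class.  */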
+
+/* A C expression whose value is a register class containing hard
+   register REGNO.  In general there is more than one such class;
+   choose a class which is "minimal", meaning that no smaller class
+   also contains the register.  */
+
+#define REGNO_REG_CLASS(REGNO) loongarch_regno_to_class[(REGNO)]
+
+/* A macro whose definition is the name of the class to which a
+   valid base register must belong.  A base register is one used in
+   an address which is the register value plus a displacement.  */
+
+#define BASE_REG_CLASS (GR_REGS)
+
+/* A macro whose definition is the name of the class to which a
+   valid index register must belong.  An index register is one used
+   in an address where its value is either multiplied by a scale
+   factor or added to another register (as well as added to a
+   displacement).  */
+
+#define INDEX_REG_CLASS GR_REGS
+
+/* We generally want to put call-clobbered registers ahead of
+   call-saved ones.  (IRA expects this.)  */
+
+#define REG_ALLOC_ORDER							\
+{ /* Call-clobbered GPRs.  */						\
+  12, 13, 14, 15, 16, 17, 18, 19, 20, 4, 5, 6, 7, 8, 9, 10, 11, 1,	\
+  /* Call-saved GPRs.  */						\
+  23, 24, 25, 26, 27, 28, 29, 30, 31,					\
+  /* GPRs that can never be exposed to the register allocator.  */	\
+  0, 2, 3, 21, 22, 							\
+  /* Call-clobbered FPRs.  */						\
+  32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,	\
+  48, 49, 50, 51, 52, 53, 54, 55,					\
+  56, 57, 58, 59, 60, 61, 62, 63,					\
+  /* None of the remaining classes have defined call-saved		\
+     registers.  */							\
+  64, 65, 66, 67, 68, 69, 70, 71, 72, 73}
+
+#define IMM_BITS 12
+#define IMM_REACH (1LL << IMM_BITS)
+
+/* True if VALUE is an unsigned 6-bit number.  */
+
+#define UIMM6_OPERAND(VALUE) (((VALUE) & ~(unsigned HOST_WIDE_INT) 0x3f) == 0)
+
+/* True if VALUE is a signed 10-bit number.  */
+
+#define IMM10_OPERAND(VALUE) ((unsigned HOST_WIDE_INT) (VALUE) + 0x200 < 0x400)
+
+/* True if VALUE is a signed 12-bit number.  */
+
+#define IMM12_OPERAND(VALUE) \
+  ((unsigned HOST_WIDE_INT) (VALUE) + IMM_REACH / 2 < IMM_REACH)
+
+/* True if VALUE is a signed 16-bit number.  */
+
+#define IMM16_OPERAND(VALUE) \
+  ((unsigned HOST_WIDE_INT) (VALUE) + 0x8000 < 0x10000)
+
+/* True if VALUE is an unsigned 12-bit number.  */
+
+#define IMM12_OPERAND_UNSIGNED(VALUE) \
+  (((VALUE) & ~(unsigned HOST_WIDE_INT) (IMM_REACH - 1)) == 0)
+
+/* True if VALUE can be loaded into a register using LU12I.  */
+
+#define LU12I_OPERAND(VALUE) \
+  (((VALUE) | ((1UL << 31) - IMM_REACH)) == ((1UL << 31) - IMM_REACH) \
+   || ((VALUE) | ((1UL << 31) - IMM_REACH)) + IMM_REACH == 0)
+
+/* True if VALUE can be loaded into a register using LU32I.  */
+
+#define LU32I_OPERAND(VALUE) \
+  (((VALUE) | (((1ULL << 19) - 1) << 32)) == (((1ULL << 19) - 1) << 32) \
+   || ((VALUE) | (((1ULL << 19) - 1) << 32)) + (1ULL << 32) == 0)
+
+/* True if VALUE can be loaded into a register using LU52I.  */
+
+#define LU52I_OPERAND(VALUE) (((VALUE) | (0xfffULL << 52)) == (0xfffULL << 52))
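+
+/* Taken together, these predicates cover the 12/20/20/12-bit fields of a
+   64-bit constant.  As an illustration, a constant such as
+   0x123456789abcdef0 can be synthesized with lu12i.w (bits 12-31),
+   ori (bits 0-11), lu32i.d (bits 32-51) and lu52i.d (bits 52-63).  */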
+
+/* Return a value X with the low 12 bits clear, and such that
+   VALUE - X is a signed 12-bit value.  */
+
+#define CONST_HIGH_PART(VALUE) (((VALUE) + (IMM_REACH / 2)) & ~(IMM_REACH - 1))
+
+#define CONST_LOW_PART(VALUE) ((VALUE) - CONST_HIGH_PART (VALUE))
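+
+/* For example, for VALUE == 0x1801, CONST_HIGH_PART is
+   (0x1801 + 0x800) & ~0xfff == 0x2000 and CONST_LOW_PART is
+   0x1801 - 0x2000 == -0x7ff, a valid signed 12-bit immediate.  */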
+
+#define IMM12_INT(X) IMM12_OPERAND (INTVAL (X))
+#define IMM12_INT_UNSIGNED(X) IMM12_OPERAND_UNSIGNED (INTVAL (X))
+#define LU12I_INT(X) LU12I_OPERAND (INTVAL (X))
+#define LU32I_INT(X) LU32I_OPERAND (INTVAL (X))
+#define LU52I_INT(X) LU52I_OPERAND (INTVAL (X))
+#define LARCH_U12BIT_OFFSET_P(OFFSET) (IN_RANGE (OFFSET, -2048, 2047))
+#define LARCH_9BIT_OFFSET_P(OFFSET) (IN_RANGE (OFFSET, -256, 255))
+#define LARCH_16BIT_OFFSET_P(OFFSET) (IN_RANGE (OFFSET, -32768, 32767))
+#define LARCH_SHIFT_2_OFFSET_P(OFFSET) (((OFFSET) & 0x3) == 0)
+
+/* Return the maximum number of consecutive registers
+   needed to represent mode MODE in a register of class CLASS.  */
+
+#define CLASS_MAX_NREGS(CLASS, MODE) loongarch_class_max_nregs (CLASS, MODE)
+
+/* Stack layout; function entry, exit and calling.  */
+
+#define STACK_GROWS_DOWNWARD 1
+
+#define FRAME_GROWS_DOWNWARD 1
+
+#define RETURN_ADDR_RTX loongarch_return_addr
+
+/* Don't use the least-significant bit to tell pointers to code
+   from vtable indexes.  */
+
+#define TARGET_PTRMEMFUNC_VBIT_LOCATION ptrmemfunc_vbit_in_delta
+
+#define ELIMINABLE_REGS \
+  { \
+    {ARG_POINTER_REGNUM, STACK_POINTER_REGNUM}, \
+      {ARG_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM}, \
+      {FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM}, \
+      {FRAME_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM}, \
+  }
+
+#define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET) \
+  (OFFSET) = loongarch_initial_elimination_offset ((FROM), (TO))
+
+/* Allocate stack space for arguments at the beginning of each function.  */
+#define ACCUMULATE_OUTGOING_ARGS 1
+
+/* The argument pointer always points to the first argument.  */
+#define FIRST_PARM_OFFSET(FNDECL) 0
+
+#define REG_PARM_STACK_SPACE(FNDECL) 0
+
+/* Define this if it is the responsibility of the caller to
+   allocate the area reserved for arguments passed in registers.
+   If `ACCUMULATE_OUTGOING_ARGS' is also defined, the only effect
+   of this macro is to determine whether the space is included in
+   `crtl->outgoing_args_size'.  */
+#define OUTGOING_REG_PARM_STACK_SPACE(FNTYPE) 1
+
+#define STACK_BOUNDARY (TARGET_ABI_LP64 ? 128 : 64)
+
+/* Symbolic macros for the registers used to return integer and floating
+   point values.  */
+
+#define GP_RETURN (GP_REG_FIRST + 4)
+#define FP_RETURN ((TARGET_SOFT_FLOAT) ? GP_RETURN : (FP_REG_FIRST + 0))
+
+#define MAX_ARGS_IN_REGISTERS 8
+
+/* Symbolic macros for the first/last argument registers.  */
+
+#define GP_ARG_FIRST (GP_REG_FIRST + 4)
+#define GP_ARG_LAST (GP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1)
+#define FP_ARG_FIRST (FP_REG_FIRST + 0)
+#define FP_ARG_LAST (FP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1)
+
+/* 1 if N is a possible register number for function argument passing.
+   We have no FP argument registers when soft-float.  */
+
+/* Accept arguments in a0-a7, and in fa0-fa7 if permitted by the ABI.  */
+#define FUNCTION_ARG_REGNO_P(N) \
+  (IN_RANGE ((N), GP_ARG_FIRST, GP_ARG_LAST) \
+   || (UNITS_PER_FP_ARG && IN_RANGE ((N), FP_ARG_FIRST, FP_ARG_LAST)))
+
+/* This structure has to cope with two different argument allocation
+   schemes.  Most LoongArch ABIs view the arguments as a structure, of which
+   the first N words go in registers and the rest go on the stack.  If I
+   < N, the Ith word might go in Ith integer argument register or in a
+   floating-point register.  For these ABIs, we only need to remember
+   the offset of the current argument into the structure.
+
+   So for the standard ABIs, the first N words are allocated to integer
+   registers, and loongarch_function_arg decides on an argument-by-argument
+   basis whether that argument should really go in an integer register,
+   or in a floating-point one.  */
+
+typedef struct loongarch_args
+{
+  /* Number of integer registers used so far, up to MAX_ARGS_IN_REGISTERS.  */
+  unsigned int num_gprs;
+
+  /* Number of floating-point registers used so far, likewise.  */
+  unsigned int num_fprs;
+
+} CUMULATIVE_ARGS;
+
+/* Initialize a variable CUM of type CUMULATIVE_ARGS
+   for a call to a function whose data type is FNTYPE.
+   For a library call, FNTYPE is 0.  */
+
+#define INIT_CUMULATIVE_ARGS(CUM, FNTYPE, LIBNAME, INDIRECT, N_NAMED_ARGS) \
+  memset (&(CUM), 0, sizeof (CUM))
+
+#define EPILOGUE_USES(REGNO) loongarch_epilogue_uses (REGNO)
+
+/* Treat LOC as a byte offset from the stack pointer and round it up
+   to the next fully-aligned offset.  */
+#define LARCH_STACK_ALIGN(LOC) \
+  (TARGET_ABI_LP64 ? ROUND_UP ((LOC), 16) : ROUND_UP ((LOC), 8))
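+
+/* For example, under lp64 a 20-byte frame is rounded up to 32 bytes,
+   since the stack is kept 16-byte aligned there.  */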
+
+/* Output assembler code to FILE to increment profiler label # LABELNO
+   for profiling a function entry.  */
+
+#define MCOUNT_NAME "_mcount"
+
+/* Emit rtl for profiling.  Output assembler code to FILE
+   to call "_mcount" for profiling a function entry.  */
+#define PROFILE_HOOK(LABEL) \
+  { \
+    rtx fun, ra; \
+    ra = get_hard_reg_initial_val (Pmode, RETURN_ADDR_REGNUM); \
+    fun = gen_rtx_SYMBOL_REF (Pmode, MCOUNT_NAME); \
+    emit_library_call (fun, LCT_NORMAL, VOIDmode, ra, Pmode); \
+  }
+
+/* The profiling work is all done in PROFILE_HOOK, but this macro must
+   still be defined.  */
+#define FUNCTION_PROFILER(STREAM, LABELNO) do { } while (0)
+
+#define NO_PROFILE_COUNTERS 1
+
+/* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
+   the stack pointer does not matter.  The value is tested only in
+   functions that have frame pointers.
+   No definition is equivalent to always zero.  */
+
+#define EXIT_IGNORE_STACK 1
+
+/* Trampolines are a block of code followed by two pointers.  */
+
+#define TRAMPOLINE_CODE_SIZE 16
+#define TRAMPOLINE_SIZE \
+  ((Pmode == SImode) ? TRAMPOLINE_CODE_SIZE \
+		     : (TRAMPOLINE_CODE_SIZE + POINTER_SIZE * 2))
+#define TRAMPOLINE_ALIGNMENT POINTER_SIZE
+
+/* loongarch_trampoline_init calls this library function to flush
+   program and data caches.  */
+
+#ifndef CACHE_FLUSH_FUNC
+#define CACHE_FLUSH_FUNC "_flush_cache"
+#endif
+
+/* Addressing modes, and classification of registers for them.  */
+
+#define REGNO_OK_FOR_INDEX_P(REGNO) \
+  loongarch_regno_mode_ok_for_base_p (REGNO, VOIDmode, 1)
+
+#define REGNO_MODE_OK_FOR_BASE_P(REGNO, MODE) \
+  loongarch_regno_mode_ok_for_base_p (REGNO, MODE, 1)
+
+/* Maximum number of registers that can appear in a valid memory address.  */
+
+#define MAX_REGS_PER_ADDRESS 2
+
+/* Check for constness inline but use loongarch_legitimate_address_p
+   to check whether a constant really is an address.  */
+
+#define CONSTANT_ADDRESS_P(X) (CONSTANT_P (X) && memory_address_p (SImode, X))
+
+/* This handles the magic '..CURRENT_FUNCTION' symbol, which means
+   'the start of the function that this code is output in'.  */
+
+#define ASM_OUTPUT_LABELREF(FILE, NAME) \
+  do \
+    { \
+      if (strcmp (NAME, "..CURRENT_FUNCTION") == 0) \
+	asm_fprintf ((FILE), "%U%s", \
+		     XSTR (XEXP (DECL_RTL (current_function_decl), 0), 0)); \
+      else \
+	asm_fprintf ((FILE), "%U%s", (NAME)); \
+    } \
+  while (0)
+
+#define CASE_VECTOR_MODE Pmode
+
+/* Only use short offsets if their range will not overflow.  */
+#define CASE_VECTOR_SHORTEN_MODE(MIN, MAX, BODY) Pmode
+
+/* Define this as 1 if `char' should by default be signed; else as 0.  */
+#ifndef DEFAULT_SIGNED_CHAR
+#define DEFAULT_SIGNED_CHAR 1
+#endif
+
+/* The SPARC port says:
+   The maximum number of bytes that a single instruction
+   can move quickly between memory and registers or between
+   two memory locations.  */
+#define MOVE_MAX UNITS_PER_WORD
+#define MAX_MOVE_MAX 8
+
+/* The SPARC port says:
+   Nonzero if access to memory by bytes is slow and undesirable.
+   For RISC chips, it means that access to memory by bytes is no
+   better than access by words when possible, so grab a whole word
+   and maybe make use of that.  */
+#define SLOW_BYTE_ACCESS 1
+
+/* Standard LoongArch integer shifts truncate the shift amount to the
+   width of the shifted operand.  */
+#define SHIFT_COUNT_TRUNCATED 1
+
+/* Specify the machine mode that pointers have.
+   After generation of rtl, the compiler makes no further distinction
+   between pointers and any other objects of this machine mode.  */
+
+#ifndef Pmode
+#define Pmode (TARGET_64BIT ? DImode : SImode)
+#endif
+
+/* Give call MEMs SImode since it is the "most permissive" mode
+   for both 32-bit and 64-bit targets.  */
+
+#define FUNCTION_MODE SImode
+
+/* We allocate $fcc registers by hand and can't cope with moves of
+   CCmode registers to and from pseudos (or memory).  */
+#define AVOID_CCMODE_COPIES
+
+/* A C expression for the cost of a branch instruction.  A value of
+   1 is the default; other values are interpreted relative to that.  */
+
+#define BRANCH_COST(speed_p, predictable_p) loongarch_branch_cost
+#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
+
+/* Return the asm template for a conditional branch instruction.
+   OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
+   its operands.  */
+#define LARCH_BRANCH(OPCODE, OPERANDS) OPCODE "\t" OPERANDS
+
+/* Control the assembler format that we output.  */
+
+/* Output to assembler file text saying following lines
+   may contain character constants, extra white space, comments, etc.  */
+
+#ifndef ASM_APP_ON
+#define ASM_APP_ON " #APP\n"
+#endif
+
+/* Output to assembler file text saying following lines
+   no longer contain unusual constructs.  */
+
+#ifndef ASM_APP_OFF
+#define ASM_APP_OFF " #NO_APP\n"
+#endif
+
+#define REGISTER_NAMES							  \
+{ "$r0",   "$r1",   "$r2",   "$r3",   "$r4",   "$r5",   "$r6",   "$r7",   \
+  "$r8",   "$r9",   "$r10",  "$r11",  "$r12",  "$r13",  "$r14",  "$r15",  \
+  "$r16",  "$r17",  "$r18",  "$r19",  "$r20",  "$r21",  "$r22",  "$r23",  \
+  "$r24",  "$r25",  "$r26",  "$r27",  "$r28",  "$r29",  "$r30",  "$r31",  \
+  "$f0",  "$f1",  "$f2",  "$f3",  "$f4",  "$f5",  "$f6",  "$f7",	  \
+  "$f8",  "$f9",  "$f10", "$f11", "$f12", "$f13", "$f14", "$f15",	  \
+  "$f16", "$f17", "$f18", "$f19", "$f20", "$f21", "$f22", "$f23",	  \
+  "$f24", "$f25", "$f26", "$f27", "$f28", "$f29", "$f30", "$f31",	  \
+  "$fcc0","$fcc1","$fcc2","$fcc3","$fcc4","$fcc5","$fcc6","$fcc7",	  \
+  "$arg", "$frame"}
+
+/* This macro defines additional names for hard registers.  */
+
+#define ADDITIONAL_REGISTER_NAMES					\
+{									\
+  { "zero",	 0 + GP_REG_FIRST },					\
+  { "ra",	 1 + GP_REG_FIRST },					\
+  { "tp",	 2 + GP_REG_FIRST },					\
+  { "sp",	 3 + GP_REG_FIRST },					\
+  { "a0",	 4 + GP_REG_FIRST },					\
+  { "a1",	 5 + GP_REG_FIRST },					\
+  { "a2",	 6 + GP_REG_FIRST },					\
+  { "a3",	 7 + GP_REG_FIRST },					\
+  { "a4",	 8 + GP_REG_FIRST },					\
+  { "a5",	 9 + GP_REG_FIRST },					\
+  { "a6",	10 + GP_REG_FIRST },					\
+  { "a7",	11 + GP_REG_FIRST },					\
+  { "t0",	12 + GP_REG_FIRST },					\
+  { "t1",	13 + GP_REG_FIRST },					\
+  { "t2",	14 + GP_REG_FIRST },					\
+  { "t3",	15 + GP_REG_FIRST },					\
+  { "t4",	16 + GP_REG_FIRST },					\
+  { "t5",	17 + GP_REG_FIRST },					\
+  { "t6",	18 + GP_REG_FIRST },					\
+  { "t7",	19 + GP_REG_FIRST },					\
+  { "t8",	20 + GP_REG_FIRST },					\
+  { "x",	21 + GP_REG_FIRST },					\
+  { "fp",	22 + GP_REG_FIRST },					\
+  { "s0",	23 + GP_REG_FIRST },					\
+  { "s1",	24 + GP_REG_FIRST },					\
+  { "s2",	25 + GP_REG_FIRST },					\
+  { "s3",	26 + GP_REG_FIRST },					\
+  { "s4",	27 + GP_REG_FIRST },					\
+  { "s5",	28 + GP_REG_FIRST },					\
+  { "s6",	29 + GP_REG_FIRST },					\
+  { "s7",	30 + GP_REG_FIRST },					\
+  { "s8",	31 + GP_REG_FIRST },					\
+  { "v0",	 4 + GP_REG_FIRST },					\
+  { "v1",	 5 + GP_REG_FIRST }					\
+}
+
+/* The LoongArch implementation uses some labels for its own purpose.  The
+   following lists what labels are created, and are all formed by the
+   pattern $L[a-z].*.  The machine independent portion of GCC creates
+   labels matching:  $L[A-Z][0-9]+ and $L[0-9]+.
+
+	LM[0-9]+	Silicon Graphics/ECOFF stabs label before each stmt.
+	$Lb[0-9]+	Begin blocks for LoongArch debug support
+	$Lc[0-9]+	Label for use in s<xx> operation.
+	$Le[0-9]+	End blocks for LoongArch debug support.  */
+
+#undef ASM_DECLARE_OBJECT_NAME
+#define ASM_DECLARE_OBJECT_NAME(STREAM, NAME, DECL) \
+  loongarch_declare_object (STREAM, NAME, "", ":\n")
+
+/* Globalizing directive for a label.  */
+#define GLOBAL_ASM_OP "\t.globl\t"
+
+/* This says how to define a global common symbol.  */
+
+#define ASM_OUTPUT_ALIGNED_DECL_COMMON loongarch_output_aligned_decl_common
+
+/* This says how to define a local common symbol (i.e., one not visible
+   to the linker).  */
+
+#ifndef ASM_OUTPUT_ALIGNED_LOCAL
+#define ASM_OUTPUT_ALIGNED_LOCAL(STREAM, NAME, SIZE, ALIGN) \
+  loongarch_declare_common_object (STREAM, NAME, "\n\t.lcomm\t", SIZE, ALIGN, \
+				   false)
+#endif
+
+/* This says how to output an external.  It would be possible not to
+   output anything and let undefined symbols become external.  However,
+   the assembler uses length information on externals to allocate them in
+   data/sdata or bss/sbss, thereby saving execution time.  */
+
+#undef ASM_OUTPUT_EXTERNAL
+#define ASM_OUTPUT_EXTERNAL(STREAM, DECL, NAME) \
+  loongarch_output_external (STREAM, DECL, NAME)
+
+/* This is how to declare a function name.  The actual work of
+   emitting the label is moved to function_prologue, so that we can
+   get the line number correctly emitted before the .ent directive,
+   and after any .file directives.  Define as empty so that the function
+   is not declared before the .ent directive elsewhere.  */
+
+#undef ASM_DECLARE_FUNCTION_NAME
+#define ASM_DECLARE_FUNCTION_NAME(STREAM, NAME, DECL) \
+  loongarch_declare_function_name (STREAM, NAME, DECL)
+
+/* This is how to store into the string LABEL
+   the symbol_ref name of an internal numbered label where
+   PREFIX is the class of label and NUM is the number within the class.
+   This is suitable for output with `assemble_name'.  */
+
+#undef ASM_GENERATE_INTERNAL_LABEL
+#define ASM_GENERATE_INTERNAL_LABEL(LABEL, PREFIX, NUM) \
+  sprintf ((LABEL), "*%s%s%ld", (LOCAL_LABEL_PREFIX), (PREFIX), (long) (NUM))
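+
+/* For example, with PREFIX "LC" and NUM 0 this yields "*.LC0", assuming
+   LOCAL_LABEL_PREFIX is "." as is usual for ELF targets.  */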
+
+/* Print debug labels as "foo = ." rather than "foo:" because they should
+   represent a byte pointer rather than an ISA-encoded address.  This is
+   particularly important for code like:
+
+	$LFBxxx = .
+		.cfi_startproc
+		...
+		.section .gcc_except_table,...
+		...
+		.uleb128 foo-$LFBxxx
+
+   The .uleb128 requires $LFBxxx to match the FDE start address, which is
+   likewise a byte pointer rather than an ISA-encoded address.
+
+   At the time of writing, this hook is not used for the function end
+   label:
+
+	$LFExxx:
+		.end foo
+
+   */
+
+#define ASM_OUTPUT_DEBUG_LABEL(FILE, PREFIX, NUM) \
+  fprintf (FILE, "%s%s%d = .\n", LOCAL_LABEL_PREFIX, PREFIX, NUM)
+
+/* This is how to output an element of a case-vector that is absolute.  */
+
+#define ASM_OUTPUT_ADDR_VEC_ELT(STREAM, VALUE) \
+  fprintf (STREAM, "\t%s\t%sL%d\n", ptr_mode == DImode ? ".dword" : ".word", \
+	   LOCAL_LABEL_PREFIX, VALUE)
+
+/* This is how to output an element of a case-vector.  */
+
+#define ASM_OUTPUT_ADDR_DIFF_ELT(STREAM, BODY, VALUE, REL) \
+  do \
+    { \
+      fprintf (STREAM, "\t%s\t%sL%d-%sL%d\n", \
+	       ptr_mode == DImode ? ".dword" : ".word", LOCAL_LABEL_PREFIX, \
+	       VALUE, LOCAL_LABEL_PREFIX, REL); \
+    } \
+  while (0)
+
+#define JUMP_TABLES_IN_TEXT_SECTION 0
+
+/* This is how to output an assembler line
+   that says to advance the location counter
+   to a multiple of 2**LOG bytes.  */
+
+#define ASM_OUTPUT_ALIGN(STREAM, LOG) fprintf (STREAM, "\t.align\t%d\n", (LOG))
+
+/* "nop" instruction 54525952 (andi $r0,$r0,0) is
+   used for padding.  */
+#define ASM_OUTPUT_ALIGN_WITH_NOP(STREAM, LOG) \
+  fprintf (STREAM, "\t.align\t%d,54525952,4\n", (LOG))
+
+/* This is how to output an assembler line to advance the location
+   counter by SIZE bytes.  */
+
+#undef ASM_OUTPUT_SKIP
+#define ASM_OUTPUT_SKIP(STREAM, SIZE) \
+  fprintf (STREAM, "\t.space\t" HOST_WIDE_INT_PRINT_UNSIGNED "\n", (SIZE))
+
+/* This is how to output a string.  */
+#undef ASM_OUTPUT_ASCII
+#define ASM_OUTPUT_ASCII loongarch_output_ascii
+
+/* Define the strings to put out for each section in the object file.  */
+#define TEXT_SECTION_ASM_OP "\t.text" /* instructions  */
+#define DATA_SECTION_ASM_OP "\t.data" /* large data  */
+
+#undef READONLY_DATA_SECTION_ASM_OP
+#define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata" /* read-only data */
+
+#define ASM_OUTPUT_REG_PUSH(STREAM, REGNO)	\
+  do \
+    { \
+      fprintf (STREAM, "\t%s\t%s,%s,-8\n\t%s\t%s,%s,0\n", \
+	       TARGET_64BIT ? "addi.d" : "addi.w", \
+	       reg_names[STACK_POINTER_REGNUM], \
+	       reg_names[STACK_POINTER_REGNUM], \
+	       TARGET_64BIT ? "st.d" : "st.w", reg_names[REGNO], \
+	       reg_names[STACK_POINTER_REGNUM]); \
+    } \
+  while (0)
+
+#define ASM_OUTPUT_REG_POP(STREAM, REGNO) \
+  do \
+    { \
+      fprintf (STREAM, "\t%s\t%s,%s,0\n\t%s\t%s,%s,8\n", \
+	       TARGET_64BIT ? "ld.d" : "ld.w", reg_names[REGNO], \
+	       reg_names[STACK_POINTER_REGNUM], \
+	       TARGET_64BIT ? "addi.d" : "addi.w", \
+	       reg_names[STACK_POINTER_REGNUM], \
+	       reg_names[STACK_POINTER_REGNUM]); \
+    } \
+  while (0)
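+
+/* As an illustration, on a 64-bit target ASM_OUTPUT_REG_PUSH for $r1 emits
+
+	addi.d	$r3,$r3,-8
+	st.d	$r1,$r3,0
+
+   and ASM_OUTPUT_REG_POP emits the matching ld.d reload followed by
+   "addi.d $r3,$r3,8" to release the slot.  */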
+
+/* How to start an assembler comment.  The leading space is important
+   (the LoongArch native assembler requires it).  */
+#ifndef ASM_COMMENT_START
+#define ASM_COMMENT_START " #"
+#endif
+
+#undef SIZE_TYPE
+#define SIZE_TYPE (POINTER_SIZE == 64 ? "long unsigned int" : "unsigned int")
+
+#undef PTRDIFF_TYPE
+#define PTRDIFF_TYPE (POINTER_SIZE == 64 ? "long int" : "int")
+
+/* The maximum number of bytes that can be copied by one iteration of
+   a cpymemsi loop; see loongarch_block_move_loop.  */
+#define LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER (UNITS_PER_WORD * 4)
+
+/* The maximum number of bytes that can be copied by a straight-line
+   implementation of cpymemsi; see loongarch_block_move_straight.  We want
+   to make sure that any loop-based implementation will iterate at
+   least twice.  */
+#define LARCH_MAX_MOVE_BYTES_STRAIGHT (LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER * 2)
+
+/* The base cost of a memcpy call, for MOVE_RATIO and friends.  These
+   values were determined experimentally by benchmarking with CSiBE.  */
+#define LARCH_CALL_RATIO 8
+
+/* Any loop-based implementation of cpymemsi will have at least
+   LARCH_MAX_MOVE_BYTES_STRAIGHT / UNITS_PER_WORD memory-to-memory
+   moves, so allow individual copies of fewer elements.
+
+   When cpymemsi is not available, use a value approximating
+   the length of a memcpy call sequence, so that move_by_pieces
+   will generate inline code if it is shorter than a function call.
+   Since move_by_pieces_ninsns counts memory-to-memory moves, but
+   we'll have to generate a load/store pair for each, halve the
+   value of LARCH_CALL_RATIO to take that into account.  */
+
+#define MOVE_RATIO(speed) \
+  (HAVE_cpymemsi \
+   ? LARCH_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD \
+   : CLEAR_RATIO (speed) / 2)
+
+/* For CLEAR_RATIO, when optimizing for size, give a better estimate
+   of the length of a memset call, but use the default otherwise.  */
+
+#define CLEAR_RATIO(speed) ((speed) ? 15 : LARCH_CALL_RATIO)
+
+/* This is similar to CLEAR_RATIO, but for a non-zero constant.  When
+   optimizing for size, adjust the ratio to account for the overhead of
+   loading the constant and replicating it across the word.  */
+
+#define SET_RATIO(speed) ((speed) ? 15 : LARCH_CALL_RATIO - 2)
+
+#ifndef USED_FOR_TARGET
+extern const enum reg_class loongarch_regno_to_class[];
+extern int loongarch_dwarf_regno[];
+
+/* Information about a function's frame layout.  */
+struct GTY (()) loongarch_frame_info
+{
+  /* The size of the frame in bytes.  */
+  HOST_WIDE_INT total_size;
+
+  /* Bit X is set if the function saves or restores GPR X.  */
+  unsigned int mask;
+
+  /* Likewise FPR X.  */
+  unsigned int fmask;
+
+  /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
+  unsigned save_libcall_adjustment;
+
+  /* Offsets of fixed-point and floating-point save areas from frame
+     bottom.  */
+  HOST_WIDE_INT gp_sp_offset;
+  HOST_WIDE_INT fp_sp_offset;
+
+  /* Offset of virtual frame pointer from stack pointer/frame bottom.  */
+  HOST_WIDE_INT frame_pointer_offset;
+
+  /* Offset of hard frame pointer from stack pointer/frame bottom.  */
+  HOST_WIDE_INT hard_frame_pointer_offset;
+
+  /* The offset of arg_pointer_rtx from the bottom of the frame.  */
+  HOST_WIDE_INT arg_pointer_offset;
+};
+
+struct GTY (()) machine_function
+{
+  /* The next floating-point condition-code register to allocate
+     for 8CC targets, relative to FCC_REG_FIRST.  */
+  unsigned int next_fcc;
+
+  /* The number of extra stack bytes taken up by register varargs.
+     This area is allocated by the callee at the very top of the frame.  */
+  int varargs_size;
+
+  /* The current frame information, calculated by
+     loongarch_compute_frame_info.  */
+  struct loongarch_frame_info frame;
+
+  /* True if at least one of the formal parameters to a function must be
+     written to the frame header (probably so its address can be taken).  */
+  bool does_not_use_frame_header;
+
+  /* True if none of the functions that are called by this function need
+     stack space allocated for their arguments.  */
+  bool optimize_call_stack;
+
+  /* True if one of the functions calling this function may not allocate
+     a frame header.  */
+  bool callers_may_not_allocate_frame;
+};
+#endif
+
+/* As on most targets, we want the .eh_frame section to be read-only where
+   possible.  And as on most targets, this means two things:
+
+     (a) Non-locally-binding pointers must have an indirect encoding,
+	 so that the addresses in the .eh_frame section itself become
+	 locally-binding.
+
+     (b) A shared library's .eh_frame section must encode locally-binding
+	 pointers in a relative (relocation-free) form.
+
+   However, LoongArch has traditionally not allowed directives like:
+
+	.long	x-.
+
+   in cases where "x" is in a different section, or is not defined in the
+   same assembly file.  We are therefore unable to emit the PC-relative
+   form required by (b) at assembly time.
+
+   Fortunately, the linker is able to convert absolute addresses into
+   PC-relative addresses on our behalf.  Unfortunately, only certain
+   versions of the linker know how to do this for indirect pointers,
+   and for personality data.  We must fall back on using writable
+   .eh_frame sections for shared libraries if the linker does not
+   support this feature.  */
+#define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
+  (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_absptr)
+
+/* Several named LoongArch patterns depend on Pmode.  These patterns have the
+   form <NAME>_si for Pmode == SImode and <NAME>_di for Pmode == DImode.
+   Add the appropriate suffix to generator function NAME and invoke it
+   with arguments ARGS.  */
+#define PMODE_INSN(NAME, ARGS) \
+  (Pmode == SImode ? NAME##_si ARGS : NAME##_di ARGS)
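+
+/* For example, for a hypothetical pattern "foo" with variants foo_si and
+   foo_di, PMODE_INSN (gen_foo, (op0)) calls gen_foo_si (op0) when Pmode
+   is SImode and gen_foo_di (op0) when Pmode is DImode.  */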
+
+/* Do emit .note.GNU-stack by default.  */
+#ifndef NEED_INDICATE_EXEC_STACK
+#define NEED_INDICATE_EXEC_STACK 1
+#endif
+
+/* Quad-precision floating point is not yet supported.  */
+/* TODO: select according to -march.  */
+#define UNITS_PER_FP_REG (TARGET_DOUBLE_FLOAT ? 8 : 4)
+
+/* The largest type that can be passed in floating-point registers.  */
+/* TODO: select according to -mabi.  */
+#define UNITS_PER_FP_ARG  \
+  (TARGET_HARD_FLOAT ? (TARGET_DOUBLE_FLOAT ? 8 : 4) : 0)
+
+#define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN)
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 06/12] LoongArch Port: Builtin functions.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (4 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 05/12] LoongArch Port: Machine description C files and .h files xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-08 18:43   ` Richard Sandiford
  2022-03-04  7:18 ` [PATCH v8 07/12] LoongArch Port: Builtin macros xuchenghua
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/

	* config/loongarch/larchintrin.h: New file.
	* config/loongarch/loongarch-builtins.cc: New file.
---
 gcc/config/loongarch/larchintrin.h         | 413 +++++++++++++++++
 gcc/config/loongarch/loongarch-builtins.cc | 511 +++++++++++++++++++++
 2 files changed, 924 insertions(+)
 create mode 100644 gcc/config/loongarch/larchintrin.h
 create mode 100644 gcc/config/loongarch/loongarch-builtins.cc

diff --git a/gcc/config/loongarch/larchintrin.h b/gcc/config/loongarch/larchintrin.h
new file mode 100644
index 00000000000..d8e2a743ae5
--- /dev/null
+++ b/gcc/config/loongarch/larchintrin.h
@@ -0,0 +1,413 @@
+/* Intrinsics for LoongArch BASE operations.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published
+by the Free Software Foundation; either version 3, or (at your
+option) any later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef _GCC_LOONGARCH_BASE_INTRIN_H
+#define _GCC_LOONGARCH_BASE_INTRIN_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef struct drdtime
+{
+  unsigned long dvalue;
+  unsigned long dtimeid;
+} __drdtime_t;
+
+typedef struct rdtime
+{
+  unsigned int value;
+  unsigned int timeid;
+} __rdtime_t;
+
+#ifdef __loongarch64
+extern __inline __drdtime_t
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_rdtime_d (void)
+{
+  __drdtime_t drdtime;
+  __asm__ volatile (
+    "rdtime.d\t%[val],%[tid]\n\t"
+    : [val]"=&r"(drdtime.dvalue),[tid]"=&r"(drdtime.dtimeid)
+    :);
+  return drdtime;
+}
+#define __rdtime_d __builtin_loongarch_rdtime_d
+#endif
+
+extern __inline __rdtime_t
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_rdtimeh_w (void)
+{
+  __rdtime_t rdtime;
+  __asm__ volatile (
+    "rdtimeh.w\t%[val],%[tid]\n\t"
+    : [val]"=&r"(rdtime.value),[tid]"=&r"(rdtime.timeid)
+    :);
+  return rdtime;
+}
+#define __rdtimeh_w __builtin_loongarch_rdtimeh_w
+
+extern __inline __rdtime_t
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_rdtimel_w (void)
+{
+  __rdtime_t rdtime;
+  __asm__ volatile (
+    "rdtimel.w\t%[val],%[tid]\n\t"
+    : [val]"=&r"(rdtime.value),[tid]"=&r"(rdtime.timeid)
+    :);
+  return rdtime;
+}
+#define __rdtimel_w __builtin_loongarch_rdtimel_w
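+
+/* Illustrative use (variable names are hypothetical): read the 64-bit
+   stable counter and its counter ID on LA64:
+
+     __drdtime_t t = __rdtime_d ();
+     unsigned long ticks = t.dvalue;
+     unsigned long id = t.dtimeid;  */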
+
+/* Assembly instruction format:	rj, fcsr.  */
+/* Data types in instruction templates:  USI, UQI.  */
+#define __movfcsr2gr(/*ui5*/ _1) __builtin_loongarch_movfcsr2gr ((_1))
+
+/* Assembly instruction format:	0, fcsr, rj.  */
+/* Data types in instruction templates:  VOID, UQI, USI.  */
+#define __movgr2fcsr(/*ui5*/ _1, _2) \
+  __builtin_loongarch_movgr2fcsr ((unsigned short) _1, (unsigned int) _2)
+
+#if defined __loongarch64
+/* Assembly instruction format:	ui5, rj, si12.  */
+/* Data types in instruction templates:  VOID, USI, UDI, SI.  */
+#define __dcacop(/*ui5*/ _1, /*unsigned long int*/ _2, /*si12*/ _3) \
+  ((void) __builtin_loongarch_dcacop ((_1), (unsigned long int) (_2), (_3)))
+#else
+#error "Don't support this ABI."
+#endif
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  USI, USI.  */
+extern __inline unsigned int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__cpucfg (unsigned int _1)
+{
+  return (unsigned int) __builtin_loongarch_cpucfg ((unsigned int) _1);
+}
+
+#ifdef __loongarch64
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  DI, DI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__asrtle_d (long int _1, long int _2)
+{
+  __builtin_loongarch_asrtle_d ((long int) _1, (long int) _2);
+}
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  DI, DI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__asrtgt_d (long int _1, long int _2)
+{
+  __builtin_loongarch_asrtgt_d ((long int) _1, (long int) _2);
+}
+#endif
+
+#if defined __loongarch64
+/* Assembly instruction format:	rd, rj, ui5.  */
+/* Data types in instruction templates:  DI, DI, UQI.  */
+#define __dlddir(/*long int*/ _1, /*ui5*/ _2) \
+  ((long int) __builtin_loongarch_dlddir ((long int) (_1), (_2)))
+#else
+#error "Don't support this ABI."
+#endif
+
+#if defined __loongarch64
+/* Assembly instruction format:	rj, ui5.  */
+/* Data types in instruction templates:  VOID, DI, UQI.  */
+#define __dldpte(/*long int*/ _1, /*ui5*/ _2) \
+  ((void) __builtin_loongarch_dldpte ((long int) (_1), (_2)))
+#else
+#error "Don't support this ABI."
+#endif
+
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, QI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crc_w_b_w (char _1, int _2)
+{
+  return (int) __builtin_loongarch_crc_w_b_w ((char) _1, (int) _2);
+}
+
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, HI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crc_w_h_w (short _1, int _2)
+{
+  return (int) __builtin_loongarch_crc_w_h_w ((short) _1, (int) _2);
+}
+
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, SI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crc_w_w_w (int _1, int _2)
+{
+  return (int) __builtin_loongarch_crc_w_w_w ((int) _1, (int) _2);
+}
+
+#ifdef __loongarch64
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, DI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crc_w_d_w (long int _1, int _2)
+{
+  return (int) __builtin_loongarch_crc_w_d_w ((long int) _1, (int) _2);
+}
+#endif
+
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, QI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crcc_w_b_w (char _1, int _2)
+{
+  return (int) __builtin_loongarch_crcc_w_b_w ((char) _1, (int) _2);
+}
+
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, HI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crcc_w_h_w (short _1, int _2)
+{
+  return (int) __builtin_loongarch_crcc_w_h_w ((short) _1, (int) _2);
+}
+
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, SI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crcc_w_w_w (int _1, int _2)
+{
+  return (int) __builtin_loongarch_crcc_w_w_w ((int) _1, (int) _2);
+}
+
+#ifdef __loongarch64
+/* Assembly instruction format:	rd, rj, rk.  */
+/* Data types in instruction templates:  SI, DI, SI.  */
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__crcc_w_d_w (long int _1, int _2)
+{
+  return (int) __builtin_loongarch_crcc_w_d_w ((long int) _1, (int) _2);
+}
+#endif
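+
+/* Illustrative use of the CRC intrinsics (the helper below is hypothetical,
+   not part of this header): fold a byte buffer into a CRC-32 accumulator.
+
+     static inline int
+     crc32_update (const char *buf, int len, int crc)
+     {
+       for (int i = 0; i < len; i++)
+	 crc = __crc_w_b_w (buf[i], crc);
+       return crc;
+     }  */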
+
+/* Assembly instruction format:	rd, ui14.  */
+/* Data types in instruction templates:  USI, USI.  */
+#define __csrrd(/*ui14*/ _1) ((unsigned int) __builtin_loongarch_csrrd ((_1)))
+
+/* Assembly instruction format:	rd, ui14.  */
+/* Data types in instruction templates:  USI, USI, USI.  */
+#define __csrwr(/*unsigned int*/ _1, /*ui14*/ _2) \
+  ((unsigned int) __builtin_loongarch_csrwr ((unsigned int) (_1), (_2)))
+
+/* Assembly instruction format:	rd, rj, ui14.  */
+/* Data types in instruction templates:  USI, USI, USI, USI.  */
+#define __csrxchg(/*unsigned int*/ _1, /*unsigned int*/ _2, /*ui14*/ _3) \
+  ((unsigned int) __builtin_loongarch_csrxchg ((unsigned int) (_1), \
+					       (unsigned int) (_2), (_3)))
+
+#ifdef __loongarch64
+/* Assembly instruction format:	rd, ui14.  */
+/* Data types in instruction templates:  UDI, USI.  */
+#define __dcsrrd(/*ui14*/ _1) \
+  ((unsigned long int) __builtin_loongarch_dcsrrd ((_1)))
+
+/* Assembly instruction format:	rd, ui14.  */
+/* Data types in instruction templates:  UDI, UDI, USI.  */
+#define __dcsrwr(/*unsigned long int*/ _1, /*ui14*/ _2) \
+  ((unsigned long int) __builtin_loongarch_dcsrwr ((unsigned long int) (_1), \
+						   (_2)))
+
+/* Assembly instruction format:	rd, rj, ui14.  */
+/* Data types in instruction templates:  UDI, UDI, UDI, USI.  */
+#define __dcsrxchg(/*unsigned long int*/ _1, /*unsigned long int*/ _2, \
+		   /*ui14*/ _3) \
+  ((unsigned long int) __builtin_loongarch_dcsrxchg ( \
+    (unsigned long int) (_1), (unsigned long int) (_2), (_3)))
+#endif
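+
+/* Illustrative use (CSR number 0 and the variable names are arbitrary):
+   update only the CSR bits selected by a mask, leaving the rest unchanged:
+
+     unsigned int old = __csrxchg (new_bits, mask, 0);
+
+   The value returned is the previous contents of the CSR.  */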
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  UQI, USI.  */
+extern __inline unsigned char
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrrd_b (unsigned int _1)
+{
+  return (unsigned char) __builtin_loongarch_iocsrrd_b ((unsigned int) _1);
+}
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  UHI, USI.  */
+extern __inline unsigned short
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrrd_h (unsigned int _1)
+{
+  return (unsigned short) __builtin_loongarch_iocsrrd_h ((unsigned int) _1);
+}
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  USI, USI.  */
+extern __inline unsigned int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrrd_w (unsigned int _1)
+{
+  return (unsigned int) __builtin_loongarch_iocsrrd_w ((unsigned int) _1);
+}
+
+#ifdef __loongarch64
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  UDI, USI.  */
+extern __inline unsigned long int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrrd_d (unsigned int _1)
+{
+  return (unsigned long int) __builtin_loongarch_iocsrrd_d ((unsigned int) _1);
+}
+#endif
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  VOID, UQI, USI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrwr_b (unsigned char _1, unsigned int _2)
+{
+  return (void) __builtin_loongarch_iocsrwr_b ((unsigned char) _1,
+					       (unsigned int) _2);
+}
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  VOID, UHI, USI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrwr_h (unsigned short _1, unsigned int _2)
+{
+  return (void) __builtin_loongarch_iocsrwr_h ((unsigned short) _1,
+					       (unsigned int) _2);
+}
+
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  VOID, USI, USI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrwr_w (unsigned int _1, unsigned int _2)
+{
+  return (void) __builtin_loongarch_iocsrwr_w ((unsigned int) _1,
+					       (unsigned int) _2);
+}
+
+#ifdef __loongarch64
+/* Assembly instruction format:	rd, rj.  */
+/* Data types in instruction templates:  VOID, UDI, USI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__iocsrwr_d (unsigned long int _1, unsigned int _2)
+{
+  return (void) __builtin_loongarch_iocsrwr_d ((unsigned long int) _1,
+					       (unsigned int) _2);
+}
+#endif
+
+/* Assembly instruction format:	ui15.  */
+/* Data types in instruction templates:  UQI.  */
+#define __dbar(/*ui15*/ _1) __builtin_loongarch_dbar ((_1))
+
+/* Assembly instruction format:	ui15.  */
+/* Data types in instruction templates:  UQI.  */
+#define __ibar(/*ui15*/ _1) __builtin_loongarch_ibar ((_1))
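+
+/* As an illustration, __dbar (0) emits the full completion barrier, the
+   form typically used for generic memory ordering.  */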
+
+#define __builtin_loongarch_syscall(a) \
+  { \
+    __asm__ volatile ("syscall %0\n\t" ::"I"(a)); \
+  }
+#define __syscall __builtin_loongarch_syscall
+
+#define __builtin_loongarch_break(a) \
+  { \
+    __asm__ volatile ("break %0\n\t" ::"I"(a)); \
+  }
+#define __break __builtin_loongarch_break
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_tlbsrch (void)
+{
+  __asm__ volatile ("tlbsrch\n\t");
+}
+#define __tlbsrch __builtin_loongarch_tlbsrch
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_tlbrd (void)
+{
+  __asm__ volatile ("tlbrd\n\t");
+}
+#define __tlbrd __builtin_loongarch_tlbrd
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_tlbwr (void)
+{
+  __asm__ volatile ("tlbwr\n\t");
+}
+#define __tlbwr __builtin_loongarch_tlbwr
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_tlbfill (void)
+{
+  __asm__ volatile ("tlbfill\n\t");
+}
+#define __tlbfill __builtin_loongarch_tlbfill
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_tlbclr (void)
+{
+  __asm__ volatile ("tlbclr\n\t");
+}
+#define __tlbclr __builtin_loongarch_tlbclr
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_tlbflush (void)
+{
+  __asm__ volatile ("tlbflush\n\t");
+}
+#define __tlbflush __builtin_loongarch_tlbflush
+
+#ifdef __cplusplus
+}
+#endif
+#endif /* _GCC_LOONGARCH_BASE_INTRIN_H */
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
new file mode 100644
index 00000000000..bc92640ff8d
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -0,0 +1,511 @@
+/* Subroutines used for expanding LoongArch builtins.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "target.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "gimple.h"
+#include "tm_p.h"
+#include "optabs.h"
+#include "recog.h"
+#include "diagnostic.h"
+#include "fold-const.h"
+#include "expr.h"
+#include "langhooks.h"
+
+/* Macros to create an enumeration identifier for a function prototype.  */
+#define LARCH_FTYPE_NAME1(A, B) LARCH_##A##_FTYPE_##B
+#define LARCH_FTYPE_NAME2(A, B, C) LARCH_##A##_FTYPE_##B##_##C
+#define LARCH_FTYPE_NAME3(A, B, C, D) LARCH_##A##_FTYPE_##B##_##C##_##D
+#define LARCH_FTYPE_NAME4(A, B, C, D, E) \
+  LARCH_##A##_FTYPE_##B##_##C##_##D##_##E
+
+/* Classifies the prototype of a built-in function.  */
+enum loongarch_function_type
+{
+#define DEF_LARCH_FTYPE(NARGS, LIST) LARCH_FTYPE_NAME##NARGS LIST,
+#include "config/loongarch/loongarch-ftypes.def"
+#undef DEF_LARCH_FTYPE
+  LARCH_MAX_FTYPE_MAX
+};
+
+/* Specifies how a built-in function should be converted into rtl.  */
+enum loongarch_builtin_type
+{
+  /* The function corresponds directly to an .md pattern.  The return
+     value is mapped to operand 0 and the arguments are mapped to
+     operands 1 and above.  */
+  LARCH_BUILTIN_DIRECT,
+
+  /* The function corresponds directly to an .md pattern.  There is no return
+     value and the arguments are mapped to operands 0 and above.  */
+  LARCH_BUILTIN_DIRECT_NO_TARGET,
+
+};
+
+/* Invoke MACRO (COND) for each fcmp.cond.{s/d} condition.  */
+#define LARCH_FP_CONDITIONS(MACRO) \
+  MACRO (f),	\
+  MACRO (un),	\
+  MACRO (eq),	\
+  MACRO (ueq),	\
+  MACRO (olt),	\
+  MACRO (ult),	\
+  MACRO (ole),	\
+  MACRO (ule),	\
+  MACRO (sf),	\
+  MACRO (ngle),	\
+  MACRO (seq),	\
+  MACRO (ngl),	\
+  MACRO (lt),	\
+  MACRO (nge),	\
+  MACRO (le),	\
+  MACRO (ngt)
+
+/* Enumerates the codes above as LARCH_FP_COND_<X>.  */
+#define DECLARE_LARCH_COND(X) LARCH_FP_COND_##X
+enum loongarch_fp_condition
+{
+  LARCH_FP_CONDITIONS (DECLARE_LARCH_COND)
+};
+#undef DECLARE_LARCH_COND
+
+/* Index X provides the string representation of LARCH_FP_COND_<X>.  */
+#define STRINGIFY(X) #X
+const char *const
+loongarch_fp_conditions[16] = {LARCH_FP_CONDITIONS (STRINGIFY)};
+#undef STRINGIFY
+
+/* Declare an availability predicate for built-in functions that require
+   COND to be true.  NAME is the main part of the predicate's name.  */
+#define AVAIL_ALL(NAME, COND) \
+  static unsigned int \
+  loongarch_builtin_avail_##NAME (void) \
+  { \
+    return (COND) ? 1 : 0; \
+  }
+
+static unsigned int
+loongarch_builtin_avail_default (void)
+{
+  return 1;
+}
+
+/* This structure describes a single built-in function.  */
+struct loongarch_builtin_description
+{
+  /* The code of the main .md file instruction.  See loongarch_builtin_type
+     for more information.  */
+  enum insn_code icode;
+
+  /* The floating-point comparison code to use with ICODE, if any.  */
+  enum loongarch_fp_condition cond;
+
+  /* The name of the built-in function.  */
+  const char *name;
+
+  /* Specifies how the function should be expanded.  */
+  enum loongarch_builtin_type builtin_type;
+
+  /* The function's prototype.  */
+  enum loongarch_function_type function_type;
+
+  /* Whether the function is available.  */
+  unsigned int (*avail) (void);
+};
+
+AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
+
+/* Construct a loongarch_builtin_description from the given arguments.
+
+   INSN is the name of the associated instruction pattern, without the
+   leading CODE_FOR_loongarch_.
+
+   CODE is the floating-point condition code associated with the
+   function.  It can be 'f' if the field is not applicable.
+
+   NAME is the name of the function itself, without the leading
+   "__builtin_loongarch_".
+
+   BUILTIN_TYPE and FUNCTION_TYPE are loongarch_builtin_description fields.
+
+   AVAIL is the name of the availability predicate, without the leading
+   loongarch_builtin_avail_.  */
+#define LARCH_BUILTIN(INSN, COND, NAME, BUILTIN_TYPE, FUNCTION_TYPE, AVAIL) \
+  { \
+    CODE_FOR_loongarch_##INSN, LARCH_FP_COND_##COND, \
+      "__builtin_loongarch_" NAME, BUILTIN_TYPE, FUNCTION_TYPE, \
+      loongarch_builtin_avail_##AVAIL \
+  }
+
+/* Define __builtin_loongarch_<INSN>, which is a LARCH_BUILTIN_DIRECT function
+   mapped to instruction CODE_FOR_loongarch_<INSN>,  FUNCTION_TYPE and AVAIL
+   are as for LARCH_BUILTIN.  */
+#define DIRECT_BUILTIN(INSN, FUNCTION_TYPE, AVAIL) \
+  LARCH_BUILTIN (INSN, f, #INSN, LARCH_BUILTIN_DIRECT, FUNCTION_TYPE, AVAIL)
+
+/* Define __builtin_loongarch_<INSN>, which is a LARCH_BUILTIN_DIRECT_NO_TARGET
+   function mapped to instruction CODE_FOR_loongarch_<INSN>,  FUNCTION_TYPE
+   and AVAIL are as for LARCH_BUILTIN.  */
+#define DIRECT_NO_TARGET_BUILTIN(INSN, FUNCTION_TYPE, AVAIL) \
+  LARCH_BUILTIN (INSN, f, #INSN, LARCH_BUILTIN_DIRECT_NO_TARGET, \
+		 FUNCTION_TYPE, AVAIL)
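+
+/* For example, the entry DIRECT_BUILTIN (cpucfg, LARCH_USI_FTYPE_USI,
+   default) below maps __builtin_loongarch_cpucfg onto
+   CODE_FOR_loongarch_cpucfg with prototype USI (USI) and makes it
+   available unconditionally via loongarch_builtin_avail_default.  */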
+
+/* CRC instruction support.  */
+#define CODE_FOR_loongarch_crc_w_b_w CODE_FOR_crc_w_b_w
+#define CODE_FOR_loongarch_crc_w_h_w CODE_FOR_crc_w_h_w
+#define CODE_FOR_loongarch_crc_w_w_w CODE_FOR_crc_w_w_w
+#define CODE_FOR_loongarch_crc_w_d_w CODE_FOR_crc_w_d_w
+#define CODE_FOR_loongarch_crcc_w_b_w CODE_FOR_crcc_w_b_w
+#define CODE_FOR_loongarch_crcc_w_h_w CODE_FOR_crcc_w_h_w
+#define CODE_FOR_loongarch_crcc_w_w_w CODE_FOR_crcc_w_w_w
+#define CODE_FOR_loongarch_crcc_w_d_w CODE_FOR_crcc_w_d_w
+
+/* Privileged state instructions.  */
+#define CODE_FOR_loongarch_cpucfg CODE_FOR_cpucfg
+#define CODE_FOR_loongarch_asrtle_d CODE_FOR_asrtle_d
+#define CODE_FOR_loongarch_asrtgt_d CODE_FOR_asrtgt_d
+#define CODE_FOR_loongarch_csrrd CODE_FOR_csrrd
+#define CODE_FOR_loongarch_dcsrrd CODE_FOR_dcsrrd
+#define CODE_FOR_loongarch_csrwr CODE_FOR_csrwr
+#define CODE_FOR_loongarch_dcsrwr CODE_FOR_dcsrwr
+#define CODE_FOR_loongarch_csrxchg CODE_FOR_csrxchg
+#define CODE_FOR_loongarch_dcsrxchg CODE_FOR_dcsrxchg
+#define CODE_FOR_loongarch_iocsrrd_b CODE_FOR_iocsrrd_b
+#define CODE_FOR_loongarch_iocsrrd_h CODE_FOR_iocsrrd_h
+#define CODE_FOR_loongarch_iocsrrd_w CODE_FOR_iocsrrd_w
+#define CODE_FOR_loongarch_iocsrrd_d CODE_FOR_iocsrrd_d
+#define CODE_FOR_loongarch_iocsrwr_b CODE_FOR_iocsrwr_b
+#define CODE_FOR_loongarch_iocsrwr_h CODE_FOR_iocsrwr_h
+#define CODE_FOR_loongarch_iocsrwr_w CODE_FOR_iocsrwr_w
+#define CODE_FOR_loongarch_iocsrwr_d CODE_FOR_iocsrwr_d
+#define CODE_FOR_loongarch_lddir CODE_FOR_lddir
+#define CODE_FOR_loongarch_dlddir CODE_FOR_dlddir
+#define CODE_FOR_loongarch_ldpte CODE_FOR_ldpte
+#define CODE_FOR_loongarch_dldpte CODE_FOR_dldpte
+#define CODE_FOR_loongarch_cacop CODE_FOR_cacop
+#define CODE_FOR_loongarch_dcacop CODE_FOR_dcacop
+#define CODE_FOR_loongarch_dbar CODE_FOR_dbar
+#define CODE_FOR_loongarch_ibar CODE_FOR_ibar
+
+static const struct loongarch_builtin_description loongarch_builtins[] = {
+#define LARCH_MOVFCSR2GR 0
+  DIRECT_BUILTIN (movfcsr2gr, LARCH_USI_FTYPE_UQI, hard_float),
+#define LARCH_MOVGR2FCSR 1
+  DIRECT_NO_TARGET_BUILTIN (movgr2fcsr, LARCH_VOID_FTYPE_UQI_USI, hard_float),
+
+  DIRECT_NO_TARGET_BUILTIN (cacop, LARCH_VOID_FTYPE_USI_USI_SI, default),
+  DIRECT_NO_TARGET_BUILTIN (dcacop, LARCH_VOID_FTYPE_USI_UDI_SI, default),
+  DIRECT_NO_TARGET_BUILTIN (dbar, LARCH_VOID_FTYPE_USI, default),
+  DIRECT_NO_TARGET_BUILTIN (ibar, LARCH_VOID_FTYPE_USI, default),
+
+  DIRECT_BUILTIN (cpucfg, LARCH_USI_FTYPE_USI, default),
+  DIRECT_BUILTIN (asrtle_d, LARCH_VOID_FTYPE_DI_DI, default),
+  DIRECT_BUILTIN (asrtgt_d, LARCH_VOID_FTYPE_DI_DI, default),
+  DIRECT_BUILTIN (dlddir, LARCH_DI_FTYPE_DI_UQI, default),
+  DIRECT_BUILTIN (lddir, LARCH_SI_FTYPE_SI_UQI, default),
+  DIRECT_NO_TARGET_BUILTIN (dldpte, LARCH_VOID_FTYPE_DI_UQI, default),
+  DIRECT_NO_TARGET_BUILTIN (ldpte, LARCH_VOID_FTYPE_SI_UQI, default),
+
+  /* CRC intrinsics.  */
+
+  DIRECT_BUILTIN (crc_w_b_w, LARCH_SI_FTYPE_QI_SI, default),
+  DIRECT_BUILTIN (crc_w_h_w, LARCH_SI_FTYPE_HI_SI, default),
+  DIRECT_BUILTIN (crc_w_w_w, LARCH_SI_FTYPE_SI_SI, default),
+  DIRECT_BUILTIN (crc_w_d_w, LARCH_SI_FTYPE_DI_SI, default),
+  DIRECT_BUILTIN (crcc_w_b_w, LARCH_SI_FTYPE_QI_SI, default),
+  DIRECT_BUILTIN (crcc_w_h_w, LARCH_SI_FTYPE_HI_SI, default),
+  DIRECT_BUILTIN (crcc_w_w_w, LARCH_SI_FTYPE_SI_SI, default),
+  DIRECT_BUILTIN (crcc_w_d_w, LARCH_SI_FTYPE_DI_SI, default),
+
+  DIRECT_BUILTIN (csrrd, LARCH_USI_FTYPE_USI, default),
+  DIRECT_BUILTIN (dcsrrd, LARCH_UDI_FTYPE_USI, default),
+  DIRECT_BUILTIN (csrwr, LARCH_USI_FTYPE_USI_USI, default),
+  DIRECT_BUILTIN (dcsrwr, LARCH_UDI_FTYPE_UDI_USI, default),
+  DIRECT_BUILTIN (csrxchg, LARCH_USI_FTYPE_USI_USI_USI, default),
+  DIRECT_BUILTIN (dcsrxchg, LARCH_UDI_FTYPE_UDI_UDI_USI, default),
+  DIRECT_BUILTIN (iocsrrd_b, LARCH_UQI_FTYPE_USI, default),
+  DIRECT_BUILTIN (iocsrrd_h, LARCH_UHI_FTYPE_USI, default),
+  DIRECT_BUILTIN (iocsrrd_w, LARCH_USI_FTYPE_USI, default),
+  DIRECT_BUILTIN (iocsrrd_d, LARCH_UDI_FTYPE_USI, default),
+  DIRECT_NO_TARGET_BUILTIN (iocsrwr_b, LARCH_VOID_FTYPE_UQI_USI, default),
+  DIRECT_NO_TARGET_BUILTIN (iocsrwr_h, LARCH_VOID_FTYPE_UHI_USI, default),
+  DIRECT_NO_TARGET_BUILTIN (iocsrwr_w, LARCH_VOID_FTYPE_USI_USI, default),
+  DIRECT_NO_TARGET_BUILTIN (iocsrwr_d, LARCH_VOID_FTYPE_UDI_USI, default),
+};
+
+/* Index I is the function declaration for loongarch_builtins[I], or null if
+   the function isn't defined on this target.  */
+static GTY (()) tree loongarch_builtin_decls[ARRAY_SIZE (loongarch_builtins)];
+/* Maps the code of an instruction to the index I of the built-in function
+   in loongarch_builtins and loongarch_builtin_decls that expands to it.  */
+static GTY (()) int loongarch_get_builtin_decl_index[NUM_INSN_CODES];
+
+/* Return a type for 'const volatile void *'.  */
+
+static tree
+loongarch_build_cvpointer_type (void)
+{
+  static tree cache;
+
+  if (cache == NULL_TREE)
+    cache = build_pointer_type (build_qualified_type (void_type_node,
+						      TYPE_QUAL_CONST
+						       | TYPE_QUAL_VOLATILE));
+  return cache;
+}
+
+/* Source-level argument types.  */
+#define LARCH_ATYPE_VOID void_type_node
+#define LARCH_ATYPE_INT integer_type_node
+#define LARCH_ATYPE_POINTER ptr_type_node
+#define LARCH_ATYPE_CVPOINTER loongarch_build_cvpointer_type ()
+
+/* Standard mode-based argument types.  */
+#define LARCH_ATYPE_QI intQI_type_node
+#define LARCH_ATYPE_UQI unsigned_intQI_type_node
+#define LARCH_ATYPE_HI intHI_type_node
+#define LARCH_ATYPE_UHI unsigned_intHI_type_node
+#define LARCH_ATYPE_SI intSI_type_node
+#define LARCH_ATYPE_USI unsigned_intSI_type_node
+#define LARCH_ATYPE_DI intDI_type_node
+#define LARCH_ATYPE_UDI unsigned_intDI_type_node
+#define LARCH_ATYPE_SF float_type_node
+#define LARCH_ATYPE_DF double_type_node
+
+/* LARCH_FTYPE_ATYPESN takes N LARCH_FTYPES-like type codes and lists
+   their associated LARCH_ATYPEs.  */
+#define LARCH_FTYPE_ATYPES1(A, B) LARCH_ATYPE_##A, LARCH_ATYPE_##B
+
+#define LARCH_FTYPE_ATYPES2(A, B, C) \
+  LARCH_ATYPE_##A, LARCH_ATYPE_##B, LARCH_ATYPE_##C
+
+#define LARCH_FTYPE_ATYPES3(A, B, C, D) \
+  LARCH_ATYPE_##A, LARCH_ATYPE_##B, LARCH_ATYPE_##C, LARCH_ATYPE_##D
+
+#define LARCH_FTYPE_ATYPES4(A, B, C, D, E) \
+  LARCH_ATYPE_##A, LARCH_ATYPE_##B, LARCH_ATYPE_##C, LARCH_ATYPE_##D, \
+  LARCH_ATYPE_##E
+
+/* Return the function type associated with function prototype TYPE.  */
+
+static tree
+loongarch_build_function_type (enum loongarch_function_type type)
+{
+  static tree types[(int) LARCH_MAX_FTYPE_MAX];
+
+  if (types[(int) type] == NULL_TREE)
+    switch (type)
+      {
+#define DEF_LARCH_FTYPE(NUM, ARGS) \
+  case LARCH_FTYPE_NAME##NUM ARGS: \
+    types[(int) type] \
+      = build_function_type_list (LARCH_FTYPE_ATYPES##NUM ARGS, NULL_TREE); \
+    break;
+#include "config/loongarch/loongarch-ftypes.def"
+#undef DEF_LARCH_FTYPE
+      default:
+	gcc_unreachable ();
+      }
+
+  return types[(int) type];
+}
+
+/* Implement TARGET_INIT_BUILTINS.  */
+
+void
+loongarch_init_builtins (void)
+{
+  const struct loongarch_builtin_description *d;
+  unsigned int i;
+  tree type;
+
+  /* Iterate through all of the bdesc arrays, initializing all of the
+     builtin functions.  */
+  for (i = 0; i < ARRAY_SIZE (loongarch_builtins); i++)
+    {
+      d = &loongarch_builtins[i];
+      if (d->avail ())
+	{
+	  type = loongarch_build_function_type (d->function_type);
+	  loongarch_builtin_decls[i]
+	    = add_builtin_function (d->name, type, i, BUILT_IN_MD, NULL,
+				    NULL);
+	  loongarch_get_builtin_decl_index[d->icode] = i;
+	}
+    }
+}
+
+/* Implement TARGET_BUILTIN_DECL.  */
+
+tree
+loongarch_builtin_decl (unsigned int code, bool initialize_p ATTRIBUTE_UNUSED)
+{
+  if (code >= ARRAY_SIZE (loongarch_builtins))
+    return error_mark_node;
+  return loongarch_builtin_decls[code];
+}
+
+/* Take argument ARGNO from EXP's argument list and convert it into
+   an expand operand.  Store the operand in *OP.  */
+
+static void
+loongarch_prepare_builtin_arg (struct expand_operand *op, tree exp,
+			       unsigned int argno)
+{
+  tree arg;
+  rtx value;
+
+  arg = CALL_EXPR_ARG (exp, argno);
+  value = expand_normal (arg);
+  create_input_operand (op, value, TYPE_MODE (TREE_TYPE (arg)));
+}
+
+/* Expand instruction ICODE as part of a built-in function sequence.
+   Use the first NOPS elements of OPS as the instruction's operands.
+   HAS_TARGET_P is true if operand 0 is a target; it is false if the
+   instruction has no target.
+
+   Return the target rtx if HAS_TARGET_P, otherwise return const0_rtx.  */
+
+static rtx
+loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
+			       struct expand_operand *ops, bool has_target_p)
+{
+  if (!maybe_expand_insn (icode, nops, ops))
+    {
+      error ("invalid argument to built-in function");
+      return has_target_p ? gen_reg_rtx (ops[0].mode) : const0_rtx;
+    }
+  return has_target_p ? ops[0].value : const0_rtx;
+}
+
+/* Expand a LARCH_BUILTIN_DIRECT or LARCH_BUILTIN_DIRECT_NO_TARGET function;
+   HAS_TARGET_P says which.  EXP is the CALL_EXPR that calls the function
+   and ICODE is the code of the associated .md pattern.  TARGET, if nonnull,
+   suggests a good place to put the result.  */
+
+static rtx
+loongarch_expand_builtin_direct (enum insn_code icode, rtx target, tree exp,
+				 bool has_target_p)
+{
+  struct expand_operand ops[MAX_RECOG_OPERANDS];
+  int opno, argno;
+
+  /* Map any target to operand 0.  */
+  opno = 0;
+  if (has_target_p)
+    create_output_operand (&ops[opno++], target, TYPE_MODE (TREE_TYPE (exp)));
+
+  /* Map the arguments to the other operands.  */
+  gcc_assert (opno + call_expr_nargs (exp)
+	      == insn_data[icode].n_generator_args);
+  for (argno = 0; argno < call_expr_nargs (exp); argno++)
+    loongarch_prepare_builtin_arg (&ops[opno++], exp, argno);
+
+  return loongarch_expand_builtin_insn (icode, opno, ops, has_target_p);
+}
+
+/* Implement TARGET_EXPAND_BUILTIN.  */
+
+rtx
+loongarch_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
+			  machine_mode mode ATTRIBUTE_UNUSED,
+			  int ignore ATTRIBUTE_UNUSED)
+{
+  tree fndecl;
+  unsigned int fcode, avail;
+  const struct loongarch_builtin_description *d;
+
+  fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  fcode = DECL_MD_FUNCTION_CODE (fndecl);
+  gcc_assert (fcode < ARRAY_SIZE (loongarch_builtins));
+  d = &loongarch_builtins[fcode];
+  avail = d->avail ();
+  gcc_assert (avail != 0);
+  switch (d->builtin_type)
+    {
+    case LARCH_BUILTIN_DIRECT:
+      return loongarch_expand_builtin_direct (d->icode, target, exp, true);
+
+    case LARCH_BUILTIN_DIRECT_NO_TARGET:
+      return loongarch_expand_builtin_direct (d->icode, target, exp, false);
+    }
+  gcc_unreachable ();
+}
+
+/* Implement TARGET_ATOMIC_ASSIGN_EXPAND_FENV.  */
+
+void
+loongarch_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
+{
+  if (!TARGET_HARD_FLOAT_ABI)
+    return;
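+
+  /* The trees built below implement GCC's floating-point environment
+     handling: HOLD saves FCSR0 and installs a copy with the exception
+     enable bits (0..4) and accrued exception flag bits (16..20) cleared
+     (the 0xffe0ffe0 mask); CLEAR re-installs that masked value; UPDATE
+     captures any newly-raised flags, restores the original FCSR0 and
+     re-raises the captured exceptions via __atomic_feraiseexcept.  */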
+  tree exceptions_var = create_tmp_var_raw (LARCH_ATYPE_USI);
+  tree fcsr_orig_var = create_tmp_var_raw (LARCH_ATYPE_USI);
+  tree fcsr_mod_var = create_tmp_var_raw (LARCH_ATYPE_USI);
+  tree const0 = build_int_cst (LARCH_ATYPE_UQI, 0);
+  tree get_fcsr = loongarch_builtin_decls[LARCH_MOVFCSR2GR];
+  tree set_fcsr = loongarch_builtin_decls[LARCH_MOVGR2FCSR];
+  tree get_fcsr_hold_call = build_call_expr (get_fcsr, 1, const0);
+  tree hold_assign_orig = build4 (TARGET_EXPR, LARCH_ATYPE_USI,
+				  fcsr_orig_var, get_fcsr_hold_call,
+				  NULL, NULL);
+  tree hold_mod_val = build2 (BIT_AND_EXPR, LARCH_ATYPE_USI, fcsr_orig_var,
+			      build_int_cst (LARCH_ATYPE_USI, 0xffe0ffe0));
+  tree hold_assign_mod = build4 (TARGET_EXPR, LARCH_ATYPE_USI,
+				 fcsr_mod_var, hold_mod_val, NULL, NULL);
+  tree set_fcsr_hold_call = build_call_expr (set_fcsr, 2, const0,
+					     fcsr_mod_var);
+  tree hold_all = build2 (COMPOUND_EXPR, LARCH_ATYPE_USI, hold_assign_orig,
+			  hold_assign_mod);
+  *hold = build2 (COMPOUND_EXPR, void_type_node, hold_all, set_fcsr_hold_call);
+
+  *clear = build_call_expr (set_fcsr, 2, const0, fcsr_mod_var);
+
+  tree get_fcsr_update_call = build_call_expr (get_fcsr, 1, const0);
+  *update = build4 (TARGET_EXPR, LARCH_ATYPE_USI, exceptions_var,
+		    get_fcsr_update_call, NULL, NULL);
+  tree set_fcsr_update_call = build_call_expr (set_fcsr, 2, const0,
+					       fcsr_orig_var);
+  *update = build2 (COMPOUND_EXPR, void_type_node, *update,
+		    set_fcsr_update_call);
+  tree atomic_feraiseexcept
+    = builtin_decl_implicit (BUILT_IN_ATOMIC_FERAISEEXCEPT);
+  tree int_exceptions_var = fold_convert (integer_type_node, exceptions_var);
+  tree atomic_feraiseexcept_call = build_call_expr (atomic_feraiseexcept, 1,
+						    int_exceptions_var);
+  *update = build2 (COMPOUND_EXPR, void_type_node, *update,
+		    atomic_feraiseexcept_call);
+}
+
+/* Implement TARGET_BUILTIN_VA_LIST.  */
+
+tree
+loongarch_build_builtin_va_list (void)
+{
+  return ptr_type_node;
+}
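
For reference, a minimal sketch of how the machinery above is reached
from user code.  Builtin names carry the "__builtin_loongarch_" prefix
used when the functions are registered; the CRC operand order shown
here (data, then accumulator) follows the crc.w.w.w instruction and is
an assumption for illustration only:

/* Compile for a LoongArch target; the call is expanded through
   loongarch_expand_builtin_direct (LARCH_BUILTIN_DIRECT).  */
unsigned int
crc32_step (unsigned int crc, int data)
{
  /* LARCH_SI_FTYPE_SI_SI: SImode result from two SImode operands.  */
  return __builtin_loongarch_crc_w_w_w (data, (int) crc);
}
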
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 07/12] LoongArch Port: Builtin macros.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (5 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 06/12] LoongArch Port: Builtin functions xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-04  7:18 ` [PATCH v8 08/12] LoongArch Port: libgcc xuchenghua
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/

	* config/loongarch/loongarch-c.cc: New file.
---
 gcc/config/loongarch/loongarch-c.cc | 109 ++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)
 create mode 100644 gcc/config/loongarch/loongarch-c.cc

diff --git a/gcc/config/loongarch/loongarch-c.cc b/gcc/config/loongarch/loongarch-c.cc
new file mode 100644
index 00000000000..e914bf306d5
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -0,0 +1,109 @@
+/* LoongArch-specific code for C family languages.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "c-family/c-common.h"
+#include "cpplib.h"
+
+#define preprocessing_asm_p() (cpp_get_options (pfile)->lang == CLK_ASM)
+#define builtin_define(TXT) cpp_define (pfile, TXT)
+#define builtin_assert(TXT) cpp_assert (pfile, TXT)
+
+/* Define preprocessor macros for the -march and -mtune options.
+   PREFIX is either _LOONGARCH_ARCH or _LOONGARCH_TUNE, INFO is
+   the selected processor.  If INFO's canonical name is "foo",
+   define PREFIX to be "foo", and define an additional macro
+   PREFIX_FOO.  */
+#define LARCH_CPP_SET_PROCESSOR(PREFIX, CPU_TYPE)			\
+  do									\
+    {									\
+      char *macro, *p;							\
+      int cpu_type = (CPU_TYPE);					\
+									\
+      macro = concat ((PREFIX), "_",					\
+		      loongarch_cpu_strings[cpu_type], NULL);		\
+      for (p = macro; *p != 0; p++)					\
+	*p = TOUPPER (*p);						\
+									\
+      builtin_define (macro);						\
+      builtin_define_with_value ((PREFIX),				\
+				 loongarch_cpu_strings[cpu_type], 1);	\
+      free (macro);							\
+    }									\
+  while (0)
+
+void
+loongarch_cpu_cpp_builtins (cpp_reader *pfile)
+{
+  builtin_assert ("machine=loongarch");
+  builtin_assert ("cpu=loongarch");
+  builtin_define ("__loongarch__");
+
+  LARCH_CPP_SET_PROCESSOR ("_LOONGARCH_ARCH", __ACTUAL_ARCH);
+  LARCH_CPP_SET_PROCESSOR ("_LOONGARCH_TUNE", __ACTUAL_TUNE);
+
+  /* Base architecture / ABI.  */
+  if (TARGET_64BIT)
+    {
+      builtin_define ("__loongarch_grlen=64");
+      builtin_define ("__loongarch64");
+    }
+
+  if (TARGET_ABI_LP64)
+    {
+      builtin_define ("_ABILP64=3");
+      builtin_define ("_LOONGARCH_SIM=_ABILP64");
+      builtin_define ("__loongarch_lp64");
+    }
+
+  /* These defines reflect the ABI in use, not whether the
+     FPU is directly accessible.  */
+  if (TARGET_DOUBLE_FLOAT_ABI)
+    builtin_define ("__loongarch_double_float=1");
+  else if (TARGET_SINGLE_FLOAT_ABI)
+    builtin_define ("__loongarch_single_float=1");
+
+  if (TARGET_DOUBLE_FLOAT_ABI || TARGET_SINGLE_FLOAT_ABI)
+    builtin_define ("__loongarch_hard_float=1");
+  else
+    builtin_define ("__loongarch_soft_float=1");
+
+  /* ISA Extensions.  */
+  if (TARGET_DOUBLE_FLOAT)
+    builtin_define ("__loongarch_frlen=64");
+  else if (TARGET_SINGLE_FLOAT)
+    builtin_define ("__loongarch_frlen=32");
+  else
+    builtin_define ("__loongarch_frlen=0");
+
+  /* Native Data Sizes.  */
+  builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
+  builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
+  builtin_define_with_int_value ("_LOONGARCH_SZPTR", POINTER_SIZE);
+  builtin_define_with_int_value ("_LOONGARCH_FPSET", 32 / MAX_FPRS_PER_FMT);
+  builtin_define_with_int_value ("_LOONGARCH_SPFPSET", 32);
+
+}
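
As a consumer-side sketch, user code can dispatch on the macros defined
above; a minimal example (macro names exactly as defined in
loongarch_cpu_cpp_builtins):

#if defined (__loongarch__) && defined (__loongarch_hard_float)
# if __loongarch_frlen == 64
/* 64-bit FPU: hardware support for single and double floats.  */
# else
/* 32-bit FPU: single-precision hardware floats only.  */
# endif
#else
/* Soft-float or non-LoongArch target.  */
#endif
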
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 08/12] LoongArch Port: libgcc
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (6 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 07/12] LoongArch Port: Builtin macros xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-08 18:59   ` Richard Sandiford
  2022-03-04  7:18 ` [PATCH v8 09/12] LoongArch Port: Regenerate libgcc/configure xuchenghua
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

libgcc/

        * config/loongarch/crtfastmath.c: New file.
        * config/loongarch/crti.S: Likewise.
        * config/loongarch/crtn.S: Likewise.
        * config/loongarch/linux-unwind.h: Likewise.
        * config/loongarch/sfp-machine.h: Likewise.
        * config/loongarch/t-crtstuff: Likewise.
        * config/loongarch/t-loongarch: Likewise.
        * config/loongarch/t-loongarch64: Likewise.
        * config/loongarch/t-softfp-tf: Likewise.
        * config.host: Add LoongArch tuples.
        * configure.ac: Add LoongArch support.
---
 libgcc/config.host                     |  28 ++++-
 libgcc/config/loongarch/crtfastmath.c  |  52 +++++++++
 libgcc/config/loongarch/crti.S         |  43 +++++++
 libgcc/config/loongarch/crtn.S         |  39 +++++++
 libgcc/config/loongarch/linux-unwind.h |  80 +++++++++++++
 libgcc/config/loongarch/sfp-machine.h  | 152 +++++++++++++++++++++++++
 libgcc/config/loongarch/t-crtstuff     |   5 +
 libgcc/config/loongarch/t-loongarch    |   7 ++
 libgcc/config/loongarch/t-loongarch64  |   1 +
 libgcc/config/loongarch/t-softfp-tf    |   3 +
 libgcc/configure.ac                    |   2 +-
 11 files changed, 410 insertions(+), 2 deletions(-)
 create mode 100644 libgcc/config/loongarch/crtfastmath.c
 create mode 100644 libgcc/config/loongarch/crti.S
 create mode 100644 libgcc/config/loongarch/crtn.S
 create mode 100644 libgcc/config/loongarch/linux-unwind.h
 create mode 100644 libgcc/config/loongarch/sfp-machine.h
 create mode 100644 libgcc/config/loongarch/t-crtstuff
 create mode 100644 libgcc/config/loongarch/t-loongarch
 create mode 100644 libgcc/config/loongarch/t-loongarch64
 create mode 100644 libgcc/config/loongarch/t-softfp-tf

diff --git a/libgcc/config.host b/libgcc/config.host
index 094fd3ad254..8c56fcae5d2 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -138,6 +138,22 @@ hppa*-*-*)
 lm32*-*-*)
 	cpu_type=lm32
 	;;
+loongarch*-*-*)
+	cpu_type=loongarch
+	tmake_file="loongarch/t-loongarch"
+	if test "${libgcc_cv_loongarch_hard_float}" = yes; then
+		tmake_file="${tmake_file} t-hardfp-sfdf t-hardfp"
+	else
+		tmake_file="${tmake_file} t-softfp-sfdf"
+	fi
+	if test "${ac_cv_sizeof_long_double}" = 16; then
+		tmake_file="${tmake_file} loongarch/t-softfp-tf"
+	fi
+	if test "${host_address}" = 64; then
+		tmake_file="${tmake_file} loongarch/t-loongarch64"
+	fi
+	tmake_file="${tmake_file} t-softfp"
+	;;
 m32r*-*-*)
         cpu_type=m32r
         ;;
@@ -925,7 +941,17 @@ lm32-*-rtems*)
 lm32-*-uclinux*)
         extra_parts="$extra_parts crtbegin.o crtendS.o crtbeginT.o"
         tmake_file="lm32/t-lm32 lm32/t-uclinux t-libgcc-pic t-softfp-sfdf t-softfp"
-	;;	
+	;;
+loongarch*-*-linux*)
+	extra_parts="$extra_parts crtfastmath.o"
+	tmake_file="${tmake_file} t-crtfm loongarch/t-crtstuff"
+	case ${host} in
+	  *)
+	    tmake_file="${tmake_file} t-slibgcc-libgcc"
+	    ;;
+	esac
+	md_unwind_header=loongarch/linux-unwind.h
+	;;
 m32r-*-elf*)
 	tmake_file="$tmake_file m32r/t-m32r t-fdpbit"
 	extra_parts="$extra_parts crtinit.o crtfini.o"
diff --git a/libgcc/config/loongarch/crtfastmath.c b/libgcc/config/loongarch/crtfastmath.c
new file mode 100644
index 00000000000..52b0d6da087
--- /dev/null
+++ b/libgcc/config/loongarch/crtfastmath.c
@@ -0,0 +1,52 @@
+/* Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License
+and a copy of the GCC Runtime Library Exception along with this
+program; see the files COPYING3 and COPYING.RUNTIME respectively.
+If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifdef __loongarch_hard_float
+
+/* Rounding control.  */
+#define _FPU_RC_NEAREST 0x000     /* RECOMMENDED.  */
+#define _FPU_RC_ZERO    0x100
+#define _FPU_RC_UP      0x200
+#define _FPU_RC_DOWN    0x300
+
+/* Enable interrupts for IEEE exceptions.  */
+#define _FPU_IEEE     0x0000001F
+
+/* Macros for accessing the hardware control word.  */
+#define _FPU_GETCW(cw) __asm__ volatile ("movfcsr2gr %0,$r0" : "=r" (cw))
+#define _FPU_SETCW(cw) __asm__ volatile ("movgr2fcsr $r0,%0" : : "r" (cw))
+
+static void __attribute__((constructor))
+set_fast_math (void)
+{
+  unsigned int fcr;
+
+  /* Flush to zero, round to nearest, IEEE exceptions disabled.  */
+  fcr = _FPU_RC_NEAREST;
+
+  _FPU_SETCW (fcr);
+}
+
+#endif /* __loongarch_hard_float  */
diff --git a/libgcc/config/loongarch/crti.S b/libgcc/config/loongarch/crti.S
new file mode 100644
index 00000000000..27b7eab3626
--- /dev/null
+++ b/libgcc/config/loongarch/crti.S
@@ -0,0 +1,43 @@
+/* Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* 4 slots for argument spill area.  1 for cpreturn, 1 for stack.
+   Return spill offset of 8.  Aligned to 16 bytes for lp64.  */
+
+	.section .init,"ax",@progbits
+	.globl	_init
+	.type	_init,@function
+_init:
+	addi.d   $r3,$r3,-16
+	st.d      $r1,$r3,8
+	addi.d   $r3,$r3,16
+	jirl	$r0,$r1,0
+
+	.section .fini,"ax",@progbits
+	.globl	_fini
+	.type	_fini,@function
+_fini:
+	addi.d   $r3,$r3,-16
+	st.d      $r1,$r3,8
+	addi.d   $r3,$r3,16
+	jirl	$r0,$r1,0
diff --git a/libgcc/config/loongarch/crtn.S b/libgcc/config/loongarch/crtn.S
new file mode 100644
index 00000000000..61cf652e856
--- /dev/null
+++ b/libgcc/config/loongarch/crtn.S
@@ -0,0 +1,39 @@
+/* Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* 4 slots for argument spill area.  1 for cpreturn, 1 for stack.
+   Return spill offset of 8.  Aligned to 16 bytes for lp64.  */
+
+
+	.section .init,"ax",@progbits
+init:
+	ld.d      $r1,$r3,8
+	addi.d	$r3,$r3,16
+	jirl	$r0,$r1,0
+
+	.section .fini,"ax",@progbits
+fini:
+	ld.d	$r1,$r3,8
+	addi.d	$r3,$r3,16
+	jirl	$r0,$r1,0
+
diff --git a/libgcc/config/loongarch/linux-unwind.h b/libgcc/config/loongarch/linux-unwind.h
new file mode 100644
index 00000000000..11723b8cdda
--- /dev/null
+++ b/libgcc/config/loongarch/linux-unwind.h
@@ -0,0 +1,80 @@
+/* DWARF2 EH unwinding support for LoongArch Linux.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef inhibit_libc
+/* Do code reading to identify a signal frame, and set the frame
+   state data appropriately.  See unwind-dw2.c for the structs.  */
+
+#include <signal.h>
+#include <sys/syscall.h>
+#include <sys/ucontext.h>
+
+#define MD_FALLBACK_FRAME_STATE_FOR loongarch_fallback_frame_state
+
+static _Unwind_Reason_Code
+loongarch_fallback_frame_state (struct _Unwind_Context *context,
+				_Unwind_FrameState *fs)
+{
+  u_int32_t *pc = (u_int32_t *) context->ra;
+  struct sigcontext *sc;
+  _Unwind_Ptr new_cfa;
+  int i;
+
+  /* 03822c0b li.w a7, 0x8b (sigreturn)  */
+  /* 002b0000 syscall 0  */
+  if (pc[1] != 0x002b0000)
+    return _URC_END_OF_STACK;
+  if (pc[0] == 0x03822c0b)
+    {
+      struct rt_sigframe
+      {
+	siginfo_t info;
+	ucontext_t uc;
+      } *rt_ = context->cfa;
+      sc = (struct sigcontext *) (void *) &rt_->uc.uc_mcontext;
+    }
+  else
+    return _URC_END_OF_STACK;
+
+  new_cfa = (_Unwind_Ptr) sc;
+  fs->regs.cfa_how = CFA_REG_OFFSET;
+  fs->regs.cfa_reg = __LIBGCC_STACK_POINTER_REGNUM__;
+  fs->regs.cfa_offset = new_cfa - (_Unwind_Ptr) context->cfa;
+
+  for (i = 0; i < 32; i++)
+    {
+      fs->regs.reg[i].how = REG_SAVED_OFFSET;
+      fs->regs.reg[i].loc.offset = (_Unwind_Ptr) & (sc->sc_regs[i]) - new_cfa;
+    }
+
+  fs->signal_frame = 1;
+  fs->regs.reg[__LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__].how
+    = REG_SAVED_VAL_OFFSET;
+  fs->regs.reg[__LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__].loc.offset
+    = (_Unwind_Ptr) (sc->sc_pc) - new_cfa;
+  fs->retaddr_column = __LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__;
+
+  return _URC_NO_REASON;
+}
+#endif
diff --git a/libgcc/config/loongarch/sfp-machine.h b/libgcc/config/loongarch/sfp-machine.h
new file mode 100644
index 00000000000..c81be71815e
--- /dev/null
+++ b/libgcc/config/loongarch/sfp-machine.h
@@ -0,0 +1,152 @@
+/* softfp machine description for LoongArch.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifdef __loongarch64
+#define _FP_W_TYPE_SIZE 64
+#define _FP_W_TYPE unsigned long long
+#define _FP_WS_TYPE signed long long
+#define _FP_I_TYPE long long
+
+typedef int TItype __attribute__ ((mode (TI)));
+typedef unsigned int UTItype __attribute__ ((mode (TI)));
+#define TI_BITS (__CHAR_BIT__ * (int) sizeof (TItype))
+
+#define _FP_MUL_MEAT_S(R, X, Y) \
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_S, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_D(R, X, Y) \
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y) \
+  _FP_MUL_MEAT_2_wide (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y) _FP_DIV_MEAT_1_udiv_norm (S, R, X, Y)
+#define _FP_DIV_MEAT_D(R, X, Y) _FP_DIV_MEAT_1_udiv_norm (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y) _FP_DIV_MEAT_2_udiv (Q, R, X, Y)
+
+#define _FP_NANFRAC_S ((_FP_QNANBIT_S << 1) - 1)
+#define _FP_NANFRAC_D ((_FP_QNANBIT_D << 1) - 1)
+#define _FP_NANFRAC_Q ((_FP_QNANBIT_Q << 1) - 1), -1
+#else
+#define _FP_W_TYPE_SIZE 32
+#define _FP_W_TYPE unsigned int
+#define _FP_WS_TYPE signed int
+#define _FP_I_TYPE int
+
+#define _FP_MUL_MEAT_S(R, X, Y) \
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_S, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_D(R, X, Y) \
+  _FP_MUL_MEAT_2_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y) \
+  _FP_MUL_MEAT_4_wide (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y) _FP_DIV_MEAT_1_udiv_norm (S, R, X, Y)
+#define _FP_DIV_MEAT_D(R, X, Y) _FP_DIV_MEAT_2_udiv (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y) _FP_DIV_MEAT_4_udiv (Q, R, X, Y)
+
+#define _FP_NANFRAC_S ((_FP_QNANBIT_S << 1) - 1)
+#define _FP_NANFRAC_D ((_FP_QNANBIT_D << 1) - 1), -1
+#define _FP_NANFRAC_Q ((_FP_QNANBIT_Q << 1) - 1), -1, -1, -1
+#endif
+
+/* The type of the result of a floating point comparison.  This must
+   match __libgcc_cmp_return__ in GCC for the target.  */
+typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
+#define CMPtype __gcc_CMPtype
+
+#define _FP_NANSIGN_S 0
+#define _FP_NANSIGN_D 0
+#define _FP_NANSIGN_Q 0
+
+#define _FP_KEEPNANFRACP 1
+#define _FP_QNANNEGATEDP 0
+
+/* NaN payloads should be preserved for NAN2008.  */
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
+  do \
+    { \
+      R##_s = X##_s; \
+      _FP_FRAC_COPY_##wc (R, X); \
+      R##_c = FP_CLS_NAN; \
+    } \
+  while (0)
+
+#ifdef __loongarch_hard_float
+#define FP_EX_INVALID 0x100000
+#define FP_EX_DIVZERO 0x080000
+#define FP_EX_OVERFLOW 0x040000
+#define FP_EX_UNDERFLOW 0x020000
+#define FP_EX_INEXACT 0x010000
+#define FP_EX_ALL \
+  (FP_EX_INVALID | FP_EX_DIVZERO | FP_EX_OVERFLOW | FP_EX_UNDERFLOW \
+   | FP_EX_INEXACT)
+
+#define FP_EX_ENABLE_SHIFT 16
+#define FP_EX_CAUSE_SHIFT 8
+
+#define FP_RND_NEAREST 0x000
+#define FP_RND_ZERO 0x100
+#define FP_RND_PINF 0x200
+#define FP_RND_MINF 0x300
+#define FP_RND_MASK 0x300
+
+#define _FP_DECL_EX \
+  unsigned long int _fcsr __attribute__ ((unused)) = FP_RND_NEAREST
+
+#define FP_INIT_ROUNDMODE \
+  do \
+    { \
+      _fcsr = __builtin_loongarch_movfcsr2gr (0); \
+    } \
+  while (0)
+
+#define FP_ROUNDMODE (_fcsr & FP_RND_MASK)
+
+#define FP_TRAPPING_EXCEPTIONS ((_fcsr << FP_EX_ENABLE_SHIFT) & FP_EX_ALL)
+
+#define FP_HANDLE_EXCEPTIONS \
+  do \
+    { \
+      _fcsr &= ~(FP_EX_ALL << FP_EX_CAUSE_SHIFT); \
+      _fcsr |= _fex | (_fex << FP_EX_CAUSE_SHIFT); \
+      __builtin_loongarch_movgr2fcsr (0, _fcsr); \
+    } \
+  while (0)
+
+#else
+#define FP_EX_INVALID (1 << 4)
+#define FP_EX_DIVZERO (1 << 3)
+#define FP_EX_OVERFLOW (1 << 2)
+#define FP_EX_UNDERFLOW (1 << 1)
+#define FP_EX_INEXACT (1 << 0)
+#endif
+
+#define _FP_TININESS_AFTER_ROUNDING 1
+
+#define __LITTLE_ENDIAN 1234
+
+#define __BYTE_ORDER __LITTLE_ENDIAN
+
+/* Define ALIASNAME as a strong alias for NAME.  */
+#define strong_alias(name, aliasname) _strong_alias (name, aliasname)
+#define _strong_alias(name, aliasname) \
+  extern __typeof (name) aliasname __attribute__ ((alias (#name)));
diff --git a/libgcc/config/loongarch/t-crtstuff b/libgcc/config/loongarch/t-crtstuff
new file mode 100644
index 00000000000..b8c36eb66b7
--- /dev/null
+++ b/libgcc/config/loongarch/t-crtstuff
@@ -0,0 +1,5 @@
+# -fasynchronous-unwind-tables is on by default for LoongArch.
+# We turn it off for crt*.o because it would make __EH_FRAME_BEGIN__ point
+# to .eh_frame data from crtbeginT.o instead of the user-defined object
+# during static linking.
+CRTSTUFF_T_CFLAGS += -fno-omit-frame-pointer -fno-asynchronous-unwind-tables
diff --git a/libgcc/config/loongarch/t-loongarch b/libgcc/config/loongarch/t-loongarch
new file mode 100644
index 00000000000..2a7dbf6ca83
--- /dev/null
+++ b/libgcc/config/loongarch/t-loongarch
@@ -0,0 +1,7 @@
+LIB2_SIDITI_CONV_FUNCS = yes
+
+softfp_float_modes :=
+softfp_int_modes := si di
+softfp_extensions :=
+softfp_truncations :=
+softfp_exclude_libgcc2 := n
diff --git a/libgcc/config/loongarch/t-loongarch64 b/libgcc/config/loongarch/t-loongarch64
new file mode 100644
index 00000000000..a1e3513e288
--- /dev/null
+++ b/libgcc/config/loongarch/t-loongarch64
@@ -0,0 +1 @@
+softfp_int_modes += ti
diff --git a/libgcc/config/loongarch/t-softfp-tf b/libgcc/config/loongarch/t-softfp-tf
new file mode 100644
index 00000000000..306677b1255
--- /dev/null
+++ b/libgcc/config/loongarch/t-softfp-tf
@@ -0,0 +1,3 @@
+softfp_float_modes += tf
+softfp_extensions += sftf dftf
+softfp_truncations += tfsf tfdf
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 6f0b67c2adc..2fc9d5d7c93 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -324,7 +324,7 @@ AC_CACHE_CHECK([whether assembler supports CFI directives], [libgcc_cv_cfi],
 # word size rather than the address size.
 cat > conftest.c <<EOF
 #if defined(__x86_64__) || (!defined(__i386__) && defined(__LP64__)) \
-    || defined(__mips64)
+    || defined(__mips64) || defined(__loongarch64)
 host_address=64
 #else
 host_address=32
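
For reference, the _FPU_GETCW/_FPU_SETCW pair from crtfastmath.c above
can be wrapped as ordinary helpers; a minimal sketch using the same asm
as the patch (hard-float configurations only):

#ifdef __loongarch_hard_float
/* Read the floating-point control/status register FCSR0.  */
static inline unsigned int
read_fcsr (void)
{
  unsigned int cw;
  __asm__ volatile ("movfcsr2gr %0,$r0" : "=r" (cw));
  return cw;
}

/* Write the floating-point control/status register FCSR0.  */
static inline void
write_fcsr (unsigned int cw)
{
  __asm__ volatile ("movgr2fcsr $r0,%0" : : "r" (cw));
}
#endif
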
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 09/12] LoongArch Port: Regenerate libgcc/configure.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (7 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 08/12] LoongArch Port: libgcc xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-04  7:18 ` [PATCH v8 10/12] LoongArch Port: libgomp xuchenghua
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

libgcc/

	* configure: Regenerated.
---
 libgcc/configure | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgcc/configure b/libgcc/configure
index 52bf25d4e94..1f9b2ac578b 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -2403,6 +2403,9 @@ case "${host}" in
 	# sets the default TLS model and affects inlining.
 	PICFLAG=-fPIC
 	;;
+    loongarch*-*-*)
+	PICFLAG=-fpic
+	;;
     mips-sgi-irix6*)
 	# PIC is the default.
 	;;
@@ -5073,7 +5076,7 @@ $as_echo "$libgcc_cv_cfi" >&6; }
 # word size rather than the address size.
 cat > conftest.c <<EOF
 #if defined(__x86_64__) || (!defined(__i386__) && defined(__LP64__)) \
-    || defined(__mips64)
+    || defined(__mips64) || defined(__loongarch64)
 host_address=64
 #else
 host_address=32
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 10/12] LoongArch Port: libgomp
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (8 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 09/12] LoongArch Port: Regenerate libgcc/configure xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-04  7:18 ` [PATCH v8 11/12] LoongArch Port: gcc/testsuite xuchenghua
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

libgomp/

        * configure.tgt: Add LoongArch triplet.
---
 libgomp/configure.tgt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt
index d4f1e741b5a..2cd7272fcd8 100644
--- a/libgomp/configure.tgt
+++ b/libgomp/configure.tgt
@@ -56,6 +56,10 @@ if test x$enable_linux_futex = xyes; then
 	config_path="linux/ia64 linux posix"
 	;;
 
+    loongarch*-*-linux*)
+	config_path="linux posix"
+	;;
+
     mips*-*-linux*)
 	config_path="linux/mips linux posix"
 	;;
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 11/12] LoongArch Port: gcc/testsuite
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (9 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 10/12] LoongArch Port: libgomp xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-08 19:06   ` Richard Sandiford
  2022-03-04  7:18 ` [PATCH v8 12/12] LoongArch Port: Add doc xuchenghua
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

gcc/testsuite/

        * g++.dg/cpp0x/constexpr-rom.C: Add build options for LoongArch.
        * g++.old-deja/g++.abi/ptrmem.C: Add LoongArch support.
        * g++.old-deja/g++.pt/ptrmem6.C: xfail for LoongArch.
        * gcc.dg/20020312-2.c: Add LoongArch support.
        * gcc.dg/loop-8.c: Skip on LoongArch.
        * gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise.
        * gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
        * go.test/go-test.exp: Define the LoongArch target.
        * lib/target-supports.exp: Likewise.
        * gcc.target/loongarch/loongarch.exp: New file.
        * gcc.target/loongarch/tst-asm-const.c: Likewise.
---
 gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C    |  2 +-
 gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C   |  2 +-
 gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C   |  2 +-
 gcc/testsuite/gcc.dg/20020312-2.c             |  2 +
 gcc/testsuite/gcc.dg/loop-8.c                 |  2 +-
 .../torture/stackalign/builtin-apply-2.c      |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c     |  2 +-
 .../gcc.target/loongarch/loongarch.exp        | 40 +++++++++++++++++++
 .../gcc.target/loongarch/tst-asm-const.c      | 16 ++++++++
 gcc/testsuite/go.test/go-test.exp             |  3 ++
 gcc/testsuite/lib/target-supports.exp         | 14 +++++++
 11 files changed, 81 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/loongarch.exp
 create mode 100644 gcc/testsuite/gcc.target/loongarch/tst-asm-const.c

diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
index 2e0ef685f36..424979a604b 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
@@ -1,6 +1,6 @@
 // PR c++/49673: check that test_data goes into .rodata
 // { dg-do compile { target c++11 } }
-// { dg-additional-options -G0 { target { { alpha*-*-* frv*-*-* ia64-*-* lm32*-*-* m32r*-*-* microblaze*-*-* mips*-*-* nios2-*-* powerpc*-*-* rs6000*-*-* } && { ! { *-*-darwin* *-*-aix* alpha*-*-*vms* } } } } }
+// { dg-additional-options -G0 { target { { alpha*-*-* frv*-*-* ia64-*-* lm32*-*-* m32r*-*-* microblaze*-*-* mips*-*-* loongarch*-*-* nios2-*-* powerpc*-*-* rs6000*-*-* } && { ! { *-*-darwin* *-*-aix* alpha*-*-*vms* } } } } }
 // { dg-final { scan-assembler "\\.rdata" { target mips*-*-* } } }
 // { dg-final { scan-assembler "rodata" { target { { *-*-linux-gnu *-*-gnu* *-*-elf } && { ! { mips*-*-* riscv*-*-* } } } } } }
 
diff --git a/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C b/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
index bda7960d8a2..f69000e9081 100644
--- a/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
+++ b/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
@@ -7,7 +7,7 @@
    function.  However, some platforms use all bits to encode a
    function pointer.  Such platforms use the lowest bit of the delta,
    that is shifted left by one bit.  */
-#if defined __MN10300__ || defined __SH5__ || defined __arm__ || defined __thumb__ || defined __mips__ || defined __aarch64__ || defined __PRU__
+#if defined __MN10300__ || defined __SH5__ || defined __arm__ || defined __thumb__ || defined __mips__ || defined __aarch64__ || defined __PRU__ || defined __loongarch__
 #define ADJUST_PTRFN(func, virt) ((void (*)())(func))
 #define ADJUST_DELTA(delta, virt) (((delta) << 1) + !!(virt))
 #else
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C b/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C
index 9f4bbe43f89..8f8f7017ab7 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C
@@ -25,7 +25,7 @@ int main() {
   h<&B::j>(); // { dg-error "" } 
   g<(void (A::*)()) &A::f>(); // { dg-error "" "" { xfail c++11 } }
   h<(int A::*) &A::i>(); // { dg-error "" "" { xfail c++11 } }
-  g<(void (A::*)()) &B::f>(); // { dg-error "" "" { xfail { c++11 && { aarch64*-*-* arm*-*-* mips*-*-* } } } }
+  g<(void (A::*)()) &B::f>(); // { dg-error "" "" { xfail { c++11 && { aarch64*-*-* arm*-*-* mips*-*-* loongarch*-*-* } } } }
   h<(int A::*) &B::j>(); // { dg-error "" } 
   g<(void (A::*)()) 0>(); // { dg-error "" "" { target { ! c++11 } } }
   h<(int A::*) 0>(); // { dg-error "" "" { target { ! c++11 } } }
diff --git a/gcc/testsuite/gcc.dg/20020312-2.c b/gcc/testsuite/gcc.dg/20020312-2.c
index 52c33d09b90..92bc150df0f 100644
--- a/gcc/testsuite/gcc.dg/20020312-2.c
+++ b/gcc/testsuite/gcc.dg/20020312-2.c
@@ -37,6 +37,8 @@ extern void abort (void);
 /* PIC register is r1, but is used even without -fpic.  */
 #elif defined(__lm32__)
 /* No pic register.  */
+#elif defined(__loongarch__)
+/* No pic register.  */
 #elif defined(__M32R__)
 /* No pic register.  */
 #elif defined(__m68k__)
diff --git a/gcc/testsuite/gcc.dg/loop-8.c b/gcc/testsuite/gcc.dg/loop-8.c
index a685fc25056..8e5f2087831 100644
--- a/gcc/testsuite/gcc.dg/loop-8.c
+++ b/gcc/testsuite/gcc.dg/loop-8.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O1 -fdump-rtl-loop2_invariant" } */
-/* { dg-skip-if "unexpected IV" { "hppa*-*-* mips*-*-* visium-*-* powerpc*-*-* riscv*-*-* mmix-*-* vax-*-*" } } */
+/* { dg-skip-if "unexpected IV" { "hppa*-*-* mips*-*-* visium-*-* powerpc*-*-* riscv*-*-* mmix-*-* vax-*-* loongarch*-*-*" } } */
 /* Load immediate on condition is available from z13 on and prevents moving
    the load out of the loop, so always run this test with -march=zEC12 that
    does not have load immediate on condition.  */
diff --git a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c
index 5ec05587dba..552ca1433f4 100644
--- a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c
+++ b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c
@@ -9,7 +9,7 @@
 /* arm_hf_eabi: Variadic funcs use Base AAPCS.  Normal funcs use VFP variant.
    avr: Variadic funcs don't pass arguments in registers, while normal funcs
         do.  */
-/* { dg-skip-if "Variadic funcs use different argument passing from normal funcs" { arm_hf_eabi || { csky*-*-* avr-*-* riscv*-*-* or1k*-*-* msp430-*-* amdgcn-*-* pru-*-* } } } */
+/* { dg-skip-if "Variadic funcs use different argument passing from normal funcs" { arm_hf_eabi || { csky*-*-* avr-*-* riscv*-*-* or1k*-*-* msp430-*-* amdgcn-*-* pru-*-* loongarch*-*-* } } } */
 /* { dg-skip-if "Variadic funcs have all args on stack. Normal funcs have args in registers." { nds32*-*-* } { v850*-*-* } } */
 /* { dg-require-effective-target untyped_assembly } */
    
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
index 6b6255b9713..224dd4f72ef 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
@@ -5,7 +5,7 @@
 
    When the condition is true, we distribute "(int) (a + b)" as
    "(int) a + (int) b", otherwise we keep the original.  */
-/* { dg-do compile { target { ! mips64 } } } */
+/* { dg-do compile { target { ! mips64 } && { ! loongarch64 } } } */
 /* { dg-options "-O -fno-tree-forwprop -fno-tree-ccp -fwrapv -fdump-tree-fre1-details" } */
 
 /* From PR14844.  */
diff --git a/gcc/testsuite/gcc.target/loongarch/loongarch.exp b/gcc/testsuite/gcc.target/loongarch/loongarch.exp
new file mode 100644
index 00000000000..41977c8e402
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/loongarch.exp
@@ -0,0 +1,40 @@
+# Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't a LoongArch target.
+if ![istarget loongarch*-*-*] then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " "
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \
+	"" $DEFAULT_CFLAGS
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.target/loongarch/tst-asm-const.c b/gcc/testsuite/gcc.target/loongarch/tst-asm-const.c
new file mode 100644
index 00000000000..2e04b99e301
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/tst-asm-const.c
@@ -0,0 +1,16 @@
+/* Test asm const. */
+/* { dg-do compile } */
+/* { dg-final { scan-assembler-times "foo:.*\\.long 1061109567.*\\.long 52" 1 } } */
+int foo ()
+{
+  __asm__ volatile (
+          "foo:"
+          "\n\t"
+	  ".long %a0\n\t"
+	  ".long %a1\n\t"
+	  :
+	  :"i"(0x3f3f3f3f), "i"(52)
+	  :
+	  );
+}
+
diff --git a/gcc/testsuite/go.test/go-test.exp b/gcc/testsuite/go.test/go-test.exp
index a402388ee9d..11c178ad7ec 100644
--- a/gcc/testsuite/go.test/go-test.exp
+++ b/gcc/testsuite/go.test/go-test.exp
@@ -232,6 +232,9 @@ proc go-set-goarch { } {
 	"riscv64-*-*" {
 	    set goarch "riscv64"
 	}
+	"loongarch64-*-*" {
+	    set goarch "loongarch64"
+	}
 	"s390*-*-*" {
 	    if [check_effective_target_ilp32] {
 		set goarch "s390"
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 737e1a8913b..843b508b010 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -286,6 +286,10 @@ proc check_configured_with { pattern } {
 proc check_weak_available { } {
     global target_cpu
 
+    if { [ string first "loongarch" $target_cpu ] >= 0 } {
+        return 1
+    }
+
     # All mips targets should support it
 
     if { [ string first "mips" $target_cpu ] >= 0 } {
@@ -1297,6 +1301,14 @@ proc check_effective_target_mpaired_single { } {
 # Return true if the target has access to FPU instructions.
 
 proc check_effective_target_hard_float { } {
+    if { [istarget loongarch*-*-*] } {
+	return [check_no_compiler_messages hard_float assembly {
+		#if (defined __loongarch_soft_float)
+		#error __loongarch_soft_float
+		#endif
+	}]
+    }
+
     if { [istarget mips*-*-*] } {
 	return [check_no_compiler_messages hard_float assembly {
 		#if (defined __mips_soft_float || defined __mips16)
@@ -8616,6 +8628,7 @@ proc check_effective_target_sync_char_short { } {
 	     || [istarget cris-*-*]
 	     || ([istarget sparc*-*-*] && [check_effective_target_sparc_v9])
 	     || ([istarget arc*-*-*] && [check_effective_target_arc_atomic])
+	     || [istarget loongarch*-*-*]
 	     || [check_effective_target_mips_llsc] }}]
 }
 
@@ -10708,6 +10721,7 @@ proc check_effective_target_branch_cost {} {
 	 || [istarget epiphany*-*-*]
 	 || [istarget frv*-*-*]
 	 || [istarget i?86-*-*] || [istarget x86_64-*-*]
+	 || [istarget loongarch*-*-*]
 	 || [istarget mips*-*-*]
 	 || [istarget s390*-*-*]
 	 || [istarget riscv*-*-*]
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v8 12/12] LoongArch Port: Add doc.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (10 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 11/12] LoongArch Port: gcc/testsuite xuchenghua
@ 2022-03-04  7:18 ` xuchenghua
  2022-03-08 19:53   ` Richard Sandiford
  2022-03-04 20:02 ` [PATCH v8 00/12] Add LoongArch support Xi Ruoyao
  2022-03-15 14:35 ` Xi Ruoyao
  13 siblings, 1 reply; 34+ messages in thread
From: xuchenghua @ 2022-03-04  7:18 UTC (permalink / raw)
  To: gcc-patches; +Cc: joseph, paul.hua.gm, xuchenghua, chenglulu

From: chenglulu <chenglulu@loongson.cn>

2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
	    Lulu Cheng  <chenglulu@loongson.cn>

	* contrib/config-list.mk: Add LoongArch triplets.
	* gcc/doc/install.texi: Add LoongArch configuration sections.
	* gcc/doc/invoke.texi: Add LoongArch options section.
	* gcc/doc/md.texi: Add LoongArch constraints section.
---
 contrib/config-list.mk |   5 +-
 gcc/doc/install.texi   |  47 +++++++++-
 gcc/doc/invoke.texi    | 202 +++++++++++++++++++++++++++++++++++++++++
 gcc/doc/md.texi        |  55 +++++++++++
 4 files changed, 303 insertions(+), 6 deletions(-)

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index 3e1d1321861..ba6f12e4693 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -57,7 +57,10 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   i686-wrs-vxworksae \
   i686-cygwinOPT-enable-threads=yes i686-mingw32crt ia64-elf \
   ia64-freebsd6 ia64-linux ia64-hpux ia64-hp-vms iq2000-elf lm32-elf \
-  lm32-rtems lm32-uclinux m32c-rtems m32c-elf m32r-elf m32rle-elf \
+  lm32-rtems lm32-uclinux \
+  loongarch64-linux-gnu loongarch64-linux-gnuf64 \
+  loongarch64-linux-gnuf32 loongarch64-linux-gnusf \
+  m32c-rtems m32c-elf m32r-elf m32rle-elf \
   m68k-elf m68k-netbsdelf \
   m68k-uclinux m68k-linux m68k-rtems \
   mcore-elf microblaze-linux microblaze-elf \
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 7258f9def6c..5fb55b1d064 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -747,9 +747,9 @@ Here are the possible CPU types:
 @quotation
 aarch64, aarch64_be, alpha, alpha64, amdgcn, arc, arceb, arm, armeb, avr, bfin,
 bpf, cr16, cris, csky, epiphany, fido, fr30, frv, ft32, h8300, hppa, hppa2.0,
-hppa64, i486, i686, ia64, iq2000, lm32, m32c, m32r, m32rle, m68k, mcore,
-microblaze, microblazeel, mips, mips64, mips64el, mips64octeon, mips64orion,
-mips64vr, mipsel, mipsisa32, mipsisa32r2, mipsisa64, mipsisa64r2,
+hppa64, i486, i686, ia64, iq2000, lm32, loongarch64, m32c, m32r, m32rle, m68k,
+mcore, microblaze, microblazeel, mips, mips64, mips64el, mips64octeon,
+mips64orion, mips64vr, mipsel, mipsisa32, mipsisa32r2, mipsisa64, mipsisa64r2,
 mipsisa64r2el, mipsisa64sb1, mipsisa64sr71k, mipstx39, mmix, mn10300, moxie,
 msp430, nds32be, nds32le, nios2, nvptx, or1k, pdp11, powerpc, powerpc64,
 powerpc64le, powerpcle, pru, riscv32, riscv32be, riscv64, riscv64be, rl78, rx,
@@ -1166,8 +1166,9 @@ sysv, aix.
 @itemx --without-multilib-list
 Specify what multilibs to build.  @var{list} is a comma separated list of
 values, possibly consisting of a single value.  Currently only implemented
-for aarch64*-*-*, arm*-*-*, riscv*-*-*, sh*-*-* and x86-64-*-linux*.  The
-accepted values and meaning for each target is given below.
+for aarch64*-*-*, arm*-*-*, loongarch64-*-*, riscv*-*-*, sh*-*-* and
+x86-64-*-linux*.  The accepted values and meaning for each target is given
+below.
 
 @table @code
 @item aarch64*-*-*
@@ -1254,6 +1255,14 @@ profile.  The union of these options is considered when specifying both
 @code{-mfloat-abi=hard}
 @end multitable
 
+@item loongarch*-*-*
+@var{list} is a comma-separated list of the following ABI identifiers:
+@code{lp64d[/base]}, @code{lp64f[/base]} and @code{lp64s[/base]}, where the
+@code{/base} suffix may be omitted, to enable their respective run-time
+libraries.  If @var{list} is empty or @code{default}, or if
+@option{--with-multilib-list} is not specified, then the default ABI
+as specified by @option{--with-abi} or implied by @option{--target} is selected.
+
 @item riscv*-*-*
 @var{list} is a single ABI name.  The target architecture must be either
 @code{rv32gc} or @code{rv64gc}.  This will build a single multilib for the
@@ -4439,6 +4448,34 @@ This configuration is intended for embedded systems.
 Lattice Mico32 processor.
 This configuration is intended for embedded systems running uClinux.
 
+@html
+<hr />
+@end html
+@anchor{loongarch}
+@heading LoongArch
+LoongArch processor.
+The following LoongArch targets are available:
+@table @code
+@item loongarch64-linux-gnu*
+LoongArch processor running GNU/Linux.  This target triplet may be coupled
+with a small set of possible suffixes that identify the default ABI type:
+@table @code
+@item f64
+Uses @code{lp64d/base} ABI by default.
+@item f32
+Uses @code{lp64f/base} ABI by default.
+@item sf
+Uses @code{lp64s/base} ABI by default.
+@end table
+
+@item loongarch64-linux-gnu
+Same as @code{loongarch64-linux-gnuf64}, but may be used with
+@option{--with-abi=*} to configure the default ABI type.
+@end table
+
+More information about LoongArch can be found at
+@uref{https://github.com/loongson/LoongArch-Documentation}.
+
 @html
 <hr />
 @end html
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 248ed534aee..d884b30b96e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -995,6 +995,16 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-mbarrel-shift-enabled  -mdivide-enabled  -mmultiply-enabled @gol
 -msign-extend-enabled  -muser-enabled}
 
+@emph{LoongArch Options}
+@gccoptlist{-march=@var{cpu-type}  -mtune=@var{cpu-type} -mabi=@var{base-abi-type} @gol
+-mfpu=@var{fpu-type} -msoft-float -msingle-float -mdouble-float @gol
+-mbranch-cost=@var{n}  -mcheck-zero-division -mno-check-zero-division @gol
+-mcond-move-int  -mno-cond-move-int @gol
+-mcond-move-float  -mno-cond-move-float @gol
+-mmemcpy  -mno-memcpy -mstrict-align -mno-strict-align @gol
+-msmall-data-limit=@var{n} -mmax-inline-memcpy-size=@var{n} @gol
+-mlra -mcmodel=@var{code-model}}
+
 @emph{M32R/D Options}
 @gccoptlist{-m32r2  -m32rx  -m32r @gol
 -mdebug @gol
@@ -18863,6 +18873,7 @@ platform.
 * HPPA Options::
 * IA-64 Options::
 * LM32 Options::
+* LoongArch Options::
 * M32C Options::
 * M32R/D Options::
 * M680x0 Options::
@@ -24378,6 +24389,197 @@ Enable user-defined instructions.
 
 @end table
 
+@node LoongArch Options
+@subsection LoongArch Options
+@cindex LoongArch Options
+
+These command-line options are defined for LoongArch targets:
+
+@table @gcctabopt
+@item -march=@var{cpu-type}
+@opindex -march
+Generate instructions for the machine type @var{cpu-type}.  In contrast to
+@option{-mtune=@var{cpu-type}}, which merely tunes the generated code
+for the specified @var{cpu-type}, @option{-march=@var{cpu-type}} allows GCC
+to generate code that may not run at all on processors other than the one
+indicated.  Specifying @option{-march=@var{cpu-type}} implies
+@option{-mtune=@var{cpu-type}}, except where noted otherwise.
+
+The choices for @var{cpu-type} are:
+
+@table @samp
+@item native
+This selects the CPU to generate code for at compilation time by determining
+the processor type of the compiling machine.  Using @option{-march=native}
+enables all instruction subsets supported by the local machine (hence
+the result might not run on different machines).  Using @option{-mtune=native}
+produces code optimized for the local machine under the constraints
+of the selected instruction set.
+@item loongarch64
+A generic CPU with 64-bit extensions.
+@item la464
+LoongArch LA464 CPU with LBT, LSX, LASX, LVZ.
+@end table
+
+
+@item -mtune=@var{cpu-type}
+@opindex mtune
+Optimize the output for the given processor, specified by microarchitecture
+name.
+
+@item -mabi=@var{base-abi-type}
+@opindex mabi
+Generate code for the specified calling convention.
+Set the base ABI to one of:
+@table @samp
+@item lp64d
+Uses 64-bit general purpose registers and 32/64-bit floating-point
+registers for parameter passing.  Data model is LP64, where int
+is 32 bits, while long int and pointers are 64 bits.
+@item lp64f
+Uses 64-bit general purpose registers and 32-bit floating-point
+registers for parameter passing.  Data model is LP64, where int
+is 32 bits, while long int and pointers are 64 bits.
+@item lp64s
+Uses 64-bit general purpose registers and no floating-point
+registers for parameter passing.  Data model is LP64, where int
+is 32 bits, while long int and pointers are 64 bits.
+@end table
+
+
+@item -mfpu=@var{fpu-type}
+@opindex mfpu
+Generate code for the specified FPU type:
+@table @samp
+@item 64
+Allow the use of hardware floating-point instructions for 32-bit
+and 64-bit operations.
+@item 32
+Allow the use of hardware floating-point instructions for 32-bit
+operations.
+@item none
+@itemx 0
+Prevent the use of hardware floating-point instructions.
+@end table
+
+
+@item -msoft-float
+@opindex msoft-float
+Force @option{-mfpu=none} and prevent the use of floating-point
+registers for parameter passing.  This option may change the target
+ABI.
+
+@item -msingle-float
+@opindex -msingle-float
+Force @option{-mfpu=32} and allow the use of 32-bit floating-point
+registers for parameter passing.  This option may change the target
+ABI.
+
+@item -mdouble-float
+@opindex -mdouble-float
+Force @option{-mfpu=64} and allow the use of 32/64-bit floating-point
+registers for parameter passing.  This option may change the target
+ABI.
+
+
+@item -mbranch-cost=@var{n}
+@opindex -mbranch-cost
+Set the cost of branches to roughly @var{n} instructions.
+
+@item -mcheck-zero-division
+@itemx -mno-check-zero-division
+@opindex -mcheck-zero-division
+Trap (do not trap) on integer division by zero.  The default is
+@option{-mcheck-zero-division}.
+
+
+@item -mcond-move-int
+@itemx -mno-cond-move-int
+@opindex -mcond-move-int
+Conditional moves for integers are enabled (disabled).  The default is
+@option{-mcond-move-int}.
+
+@item -mcond-move-float
+@itemx -mno-cond-move-float
+@opindex -mcond-move-float
+Conditional moves for floats are enabled (disabled).  The default is
+@option{-mcond-move-float}.
+
+@item -mmemcpy
+@itemx -mno-memcpy
+@opindex -mmemcpy
+Force (do not force) the use of @code{memcpy} for non-trivial block moves.
+The default is @option{-mno-memcpy}, which allows GCC to inline most
+constant-sized copies.
+
+
+@item -mlra
+@opindex -mlra
+Use the LRA register allocator.  This is the default.
+
+@item -mstrict-align
+@itemx -mno-strict-align
+@opindex -mstrict-align
+Avoid or allow generating memory accesses that may not be aligned on a natural
+object boundary as described in the architecture specification.  The default
+is @option{-mno-strict-align}.
+
+@item -msmall-data-limit=@var{number}
+@opindex -msmall-data-limit
+Put global and static data smaller than @var{number} bytes into a special
+section (on some targets).  The default value is 0.
+
+
+@item -mmax-inline-memcpy-size=@var{n}
+@opindex -mmax-inline-memcpy-size
+Set the maximum size @var{n} of a @code{memcpy} call to be expanded
+inline.  The default value of @var{n} is 1024.
+
+@item -mcmodel=@var{code-model}
+@opindex -mcmodel
+Set the code model to one of the following (the default is @code{normal}):
+@table @samp
+@item tiny-static
+@itemize @bullet
+@item
+local symbol and global strong symbol: The data section must be within +/-2MiB addressing space.
+The text section must be within +/-128MiB addressing space.
+@item
+global weak symbol: The GOT table must be within +/-2GiB addressing space.
+@end itemize
+
+@item tiny
+@itemize @bullet
+@item
+local symbol: The data section must be within +/-2MiB addressing space.
+The text section must be within +/-128MiB
+addressing space.
+@item
+global symbol: The GOT table must be within +/-2GiB addressing space.
+@end itemize
+
+@item normal
+@itemize @bullet
+@item
+local symbol: The data section must be within +/-2GiB addressing space.
+The text section must be within +/-128MiB addressing space.
+@item
+global symbol: The got table must be within +/-2GiB addressing space.
+@end itemize
+
+@item large
+@itemize @bullet
+@item
+local symbol: The data section must be within +/-2GiB addressing space.
+The text section must be within +/-128GiB addressing space.
+@item
+global symbol: The got table must be within +/-2GiB addressing space.
+@end itemize
+
+@item extreme(Not implemented yet)
+@itemize @bullet
+@item
+local symbol: The data and text section must be within +/-8EiB addressing space.
+@item
+global symbol: The data got table must be within +/-8EiB addressing space.
+@end itemize
+@end table
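+
+For example, a program whose text section exceeds the +/-128MiB range
+of the @code{normal} model could be built with (illustrative
+invocation):
+
+@example
+gcc -mcmodel=large app.c
+@end example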
+@end table
+
+
+
 @node M32C Options
 @subsection M32C Options
 @cindex M32C options
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index f3619c505c0..98c2bd67a80 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -2747,6 +2747,61 @@ Memory addressed using the small base register ($sb).
 $r1h
 @end table
 
+@item LoongArch---@file{config/loongarch/constraints.md}
+@table @code
+@item a
+A constant call global and noplt address.
+@item c
+A constant call local address.
+@item e
+A register that is used for a function call.
+@item f
+A floating-point register (if available).
+@item h
+A constant call plt address.
+@item j
+A register that is used for a sibling call.
+@item k
+A memory operand whose address is formed by a base register and
+(optionally scaled) index register.
+@item l
+A signed 16-bit constant.
+@item m
+A memory operand whose address is formed by a base register and offset
+that is suitable for use in instructions with the same addressing mode
+as @code{st.w} and @code{ld.w}.
+@item q
+A general-purpose register except for $r0 and $r1, for CSR instructions.
+@item t
+A constant call weak address.
+@item u
+A signed 52-bit constant whose low 32 bits are zero (for logic instructions).
+@item v
+A signed 64-bit constant whose low 44 bits are zero (for logic instructions).
+@item w
+Matches any valid memory.
+@item z
+A floating-point condition code register.
+@item G
+Floating-point zero.
+@item I
+A signed 12-bit constant (for arithmetic instructions).
+@item J
+Integer zero.
+@item K
+An unsigned 12-bit constant (for logic instructions).
+@item Yd
+A constant @code{move_operand} that can be safely loaded using
+@code{la}.
+@item ZB
+An address that is held in a general-purpose register.
+The offset is zero.
+@item ZC
+A memory operand whose address is formed by a base register and offset
+that is suitable for use in instructions with the same addressing mode
+as @code{ll.w} and @code{sc.w}.
+@end table
+
 @item MicroBlaze---@file{config/microblaze/constraints.md}
 @table @code
 @item d
-- 
2.31.1


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 00/12] Add LoongArch support.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (11 preceding siblings ...)
  2022-03-04  7:18 ` [PATCH v8 12/12] LoongArch Port: Add doc xuchenghua
@ 2022-03-04 20:02 ` Xi Ruoyao
  2022-03-05  1:36   ` Paul Hua
  2022-03-08 19:57   ` Richard Sandiford
  2022-03-15 14:35 ` Xi Ruoyao
  13 siblings, 2 replies; 34+ messages in thread
From: Xi Ruoyao @ 2022-03-04 20:02 UTC (permalink / raw)
  To: xuchenghua, gcc-patches; +Cc: chenglulu, joseph

On Fri, 2022-03-04 at 15:17 +0800, xuchenghua@loongson.cn wrote:

> The binutils has been merged into trunk:
> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=560b3fe208255ae909b4b1c88ba9c28b09043307
> 
> Note: We split -mabi= into -mabi=lp64d/f/s, the new options not support by upstream binutils yet,
> this GCC port requires the following patch applied to binutils to build.
> https://github.com/loongson/binutils-gdb/commit/aacb0bf860f02aa5a7dcb76dd0e392bf871c7586
> (will be submitted to upstream after gcc side comfirmed)

I don't think you need a review for the binutils change here.  You should
get it reviewed and applied in binutils-gdb ASAP.  Then in install.texi
you would add a note like "loongarch64-*-* requires binutils >= 2.39" in
"Target specific installation notes", as an unpatched 2.38 does not
work.

And based on the history of the RISC-V port
(https://gcc.gnu.org/pipermail/gcc/2017-January/222595.html) the process
for a new port seems to be:

1. Get permission from the Steering Committee.
2. Add one or two port maintainers to the MAINTAINERS file.
3. Only then does the technical review of the patch series begin.


I'm not an expert in software engineering (or social interaction :) and
I don't know whether the process has changed in the years since.
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 00/12] Add LoongArch support.
  2022-03-04 20:02 ` [PATCH v8 00/12] Add LoongArch support Xi Ruoyao
@ 2022-03-05  1:36   ` Paul Hua
  2022-03-05  1:39     ` Paul Hua
  2022-03-05  9:21     ` Xi Ruoyao
  2022-03-08 19:57   ` Richard Sandiford
  1 sibling, 2 replies; 34+ messages in thread
From: Paul Hua @ 2022-03-05  1:36 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: Xu Chenghua, gcc-patches, chenglulu, Joseph Myers

>
> And based on the history of the RISC-V port
> (https://gcc.gnu.org/pipermail/gcc/2017-January/222595.html) the process
> for a new port seems to be:
>
> 1. Get permission from the Steering Committee.
> 2. Add one or two port maintainers to the MAINTAINERS file.
> 3. Only then does the technical review of the patch series begin.
>

Hi Ruoyao,
Thanks for your advice.  But I don't know how to contact the GCC
Steering Committee.

Hi David,
Any suggestions?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 00/12] Add LoongArch support.
  2022-03-05  1:36   ` Paul Hua
@ 2022-03-05  1:39     ` Paul Hua
  2022-03-05  9:21     ` Xi Ruoyao
  1 sibling, 0 replies; 34+ messages in thread
From: Paul Hua @ 2022-03-05  1:39 UTC (permalink / raw)
  To: Xi Ruoyao, dje.gcc; +Cc: Xu Chenghua, gcc-patches, chenglulu, Joseph Myers

> > And based on the history of the RISC-V port
> > (https://gcc.gnu.org/pipermail/gcc/2017-January/222595.html) the process
> > for a new port seems to be:
> >
> > 1. Get permission from the Steering Committee.
> > 2. Add one or two port maintainers to the MAINTAINERS file.
> > 3. Only then does the technical review of the patch series begin.
> >
>
> Hi Ruoyao,
> Thanks for your advice.  But I don't know how to contact the GCC
> Steering Committee.
>
> Hi David,
> Any suggestions?
Sorry, CCed David Edelsohn.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 00/12] Add LoongArch support.
  2022-03-05  1:36   ` Paul Hua
  2022-03-05  1:39     ` Paul Hua
@ 2022-03-05  9:21     ` Xi Ruoyao
  1 sibling, 0 replies; 34+ messages in thread
From: Xi Ruoyao @ 2022-03-05  9:21 UTC (permalink / raw)
  To: Paul Hua; +Cc: Xu Chenghua, gcc-patches, chenglulu, Joseph Myers

On Sat, 2022-03-05 at 09:36 +0800, Paul Hua wrote:
> > 
> > And based on the history of the RISC-V port
> > (https://gcc.gnu.org/pipermail/gcc/2017-January/222595.html) the
> > process for a new port seems to be:
> >
> > 1. Get permission from the Steering Committee.
> > 2. Add one or two port maintainers to the MAINTAINERS file.
> > 3. Only then does the technical review of the patch series begin.
> > 
> 
> Hi Ruoyao,
> Thanks for your advice.  But I don't know how to contact the GCC
> Steering Committee.

https://gcc.gnu.org/steering.html

It says the best place to reach the SC members is the list
gcc@gcc.gnu.org.

And Joseph is also an SC member.
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 02/12] LoongArch Port: gcc build
  2022-03-04  7:17 ` [PATCH v8 02/12] LoongArch Port: gcc build xuchenghua
@ 2022-03-06 10:29   ` Richard Sandiford
  2022-03-06 10:34     ` Andreas Schwab
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Sandiford @ 2022-03-06 10:29 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph

Hi,

Thanks for the submission.  Some comments below on this patch,
but otherwise it looks good.  I hope to get to the other patches
in the series soon.

I haven't followed all of the previous discussion, so some of these
points might already have been discussed.  Sorry in advance if so.

xuchenghua@loongson.cn writes:
> diff --git a/contrib/gcc_update b/contrib/gcc_update
> index 1cf15f9b3c2..641ce164775 100755
> --- a/contrib/gcc_update
> +++ b/contrib/gcc_update
> @@ -86,6 +86,8 @@ gcc/config/arm/arm-tables.opt: gcc/config/arm/arm-cpus.in gcc/config/arm/parsecp
>  gcc/config/c6x/c6x-tables.opt: gcc/config/c6x/c6x-isas.def gcc/config/c6x/genopt.sh
>  gcc/config/c6x/c6x-sched.md: gcc/config/c6x/c6x-sched.md.in gcc/config/c6x/gensched.sh
>  gcc/config/c6x/c6x-mult.md: gcc/config/c6x/c6x-mult.md.in gcc/config/c6x/genmult.sh
> +gcc/config/loongarch/loongarch-str.h: gcc/config/loongarch/genopts/genstr.sh gcc/config/loongarch/genopts/loongarch-string
> +gcc/config/loongarch/loongarch.opt: gcc/config/loongarch/genopts/genstr.sh gcc/config/loongarch/genopts/loongarch.opt.in
>  gcc/config/m68k/m68k-tables.opt: gcc/config/m68k/m68k-devices.def gcc/config/m68k/m68k-isas.def gcc/config/m68k/m68k-microarchs.def gcc/config/m68k/genopt.sh
>  gcc/config/mips/mips-tables.opt: gcc/config/mips/mips-cpus.def gcc/config/mips/genopt.sh
>  gcc/config/rs6000/rs6000-tables.opt: gcc/config/rs6000/rs6000-cpus.def gcc/config/rs6000/genopt.sh
> diff --git a/gcc/common/config/loongarch/loongarch-common.cc b/gcc/common/config/loongarch/loongarch-common.cc
> new file mode 100644
> index 00000000000..5bdfd2a30e1
> --- /dev/null
> +++ b/gcc/common/config/loongarch/loongarch-common.cc
> @@ -0,0 +1,73 @@
> +/* Common hooks for LoongArch.
> +   Copyright (C) 2021-2022 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tm.h"
> +#include "common/common-target.h"
> +#include "common/common-target-def.h"
> +#include "opts.h"
> +#include "flags.h"
> +#include "diagnostic-core.h"
> +
> +#undef	TARGET_OPTION_OPTIMIZATION_TABLE
> +#define TARGET_OPTION_OPTIMIZATION_TABLE loongarch_option_optimization_table
> +
> +/* Set default optimization options.  */
> +static const struct default_options loongarch_option_optimization_table[] =
> +{
> +  { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
> +  { OPT_LEVELS_NONE, 0, NULL, 0 }
> +};
> +
> +/* Implement TARGET_HANDLE_OPTION.  */
> +
> +static bool
> +loongarch_handle_option (struct gcc_options *opts,
> +			 struct gcc_options *opts_set ATTRIBUTE_UNUSED,
> +			 const struct cl_decoded_option *decoded,
> +			 location_t loc ATTRIBUTE_UNUSED)
> +{
> +  size_t code = decoded->opt_index;
> +  int value = decoded->value;
> +
> +  switch (code)
> +    {
> +    case OPT_mmemcpy:
> +      if (value)
> +	{
> +	  if (opts->x_optimize_size)
> +	    opts->x_target_flags |= MASK_MEMCPY;
> +	}
> +      else
> +	opts->x_target_flags &= ~MASK_MEMCPY;
> +      return true;

I think this will make the option order-dependent: -mmemcpy -Os
could behave differently from -Os -mmemcpy.  I'm also not sure
it would trigger if the optimisation level is changed by the
source code, e.g. using __attribute__((optimize)).

If -mmemcpy is only supposed to have an effect when optimising
for size, it would probably be better to have a preprocessor
macro in loongarch.h along the lines of:

#define TARGET_MEMCPY (TARGET_RAW_MEMCPY && optimize_size)

with s/Mask(MEMCPY)/Mask(RAW_MEMCPY)/ in the .opt file.
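The .opt entry would then become something like (an untested sketch;
the help text is just a suggestion):

mmemcpy
Target Mask(RAW_MEMCPY)
Don't optimize block moves; only takes effect when optimizing for size.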

> +
> +    default:
> +      return true;
> +    }
> +}
> +
> +#undef TARGET_DEFAULT_TARGET_FLAGS
> +#define TARGET_DEFAULT_TARGET_FLAGS	MASK_CHECK_ZERO_DIV
> +#undef TARGET_HANDLE_OPTION
> +#define TARGET_HANDLE_OPTION loongarch_handle_option
> +
> +struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
> […]
> +		# Inferring ISA-related default options from the ABI: pass 2
> +		case ${with_abi}/${with_abiext} in
> +		lp64d/base)
> +			fpu_pattern="64"
> +			;;
> +		lp64f/base)
> +			fpu_pattern="32|64"
> +			fpu_default="32"
> +			;;
> +		lp64s/base)
> +			fpu_pattern="none|32|64"
> +			fpu_default="none"
> +			;;
> +		*)
> +			echo "Unsupported ABI type ${with_abi}/${with_abiext}." 1>&2
> +			exit 1
> +			;;
> +		esac
> +
> […]
> +		## Setting default value for with_fpu.
> +		if test x${with_fpu} == xnone; then
> +			with_fpu="0"
> +		fi

I don't understand this.  Won't it always trigger an error…

> +
> +		case ${with_fpu} in
> +		"")
> +			if test x${fpu_default} != x; then
> +				with_fpu=${fpu_default}
> +			else
> +				with_fpu=${fpu_pattern}
> +			fi
> +			;;
> +
> +		*)
> +			if echo "${with_fpu}" | grep -E "^${fpu_pattern}$"; then
> +				: # OK
> +			else
> +				echo "${with_abi}/${with_abiext} ABI cannot be implemented with" \
> +				"--with-fpu=${with_fpu}." 1>&2
> +				exit 1

…here?  It looks like the original “none” would have been accepted
for soft-float.

> +			fi
> +			;;
> +		esac
> +
> +
> +		# Inferring default with_tune from with_arch: pass 1
> +		case ${with_arch} in
> +		native)
> +			tune_pattern="*"
> +			tune_default="native"
> +			;;
> +		loongarch64)
> +			tune_pattern="loongarch64 | la464"

Does including the spaces work?  The earlier patterns didn't have
spaces, which seems more correct.

> +			tune_default="la464"
> +			;;
> +		*)
> +			# By default, $with_tune == $with_arch
> +			tune_pattern="$with_arch"
> +			;;
> +		esac
> +
> +		## Setting default value for with_tune.
> +		case ${with_tune} in
> +		"")
> +			if test x${tune_default} != x; then
> +				with_tune=${tune_default}
> +			else
> +				with_tune=${tune_pattern}
> +			fi
> +			;;
> +
> +		*)
> +			if echo "${with_tune}" | grep -E "^${tune_pattern}$"; then
> +				: # OK
> +			else
> +				echo "Incompatible options: --with-tune=${with_tune}" \
> +				"and --with-arch=${with_arch}." 1>&2
> +				exit 1
> +			fi
> +			;;
> +		esac
> +
> +
> +		# Perform final sanity checks.
> +		case ${with_arch} in
> +		loongarch64 | la464) ;; # OK, append here.
> +		native)
> +			if test x${host} != x${target}; then
> +				echo "--with-arch=native is illegal for cross-compiler." 1>&2
> +				exit 1
> +			fi
> +			;;
> +		"")
> +			echo "Please set a default value for \${with_arch}" \
> +			     "according to your target triplet \"${target}\"." 1>&2
> +			exit 1
> +			;;
> +		*)
> +			echo "Unknown arch in --with-arch=$with_arch" 1>&2
> +			exit 1
> +			;;
> +		esac
> +
> +		case ${with_abi} in
> +		lp64d | lp64f | lp64s) ;; # OK, append here.
> +		*)
> +			echo "Unsupported ABI given in --with-abi=$with_abi" 1>&2
> +			exit 1
> +			;;
> +		esac
> +
> +		case ${with_abiext} in
> +		base) ;; # OK, append here.
> +		*)
> +			echo "Unsupported ABI extention type $with_abiext" 1>&2
> +			exit 1
> +			;;
> +		esac
> +
> +		case ${with_fpu} in
> +		none | 32 | 64) ;; # OK, append here.
> +		*)
> +			echo "Unknown fpu type in --with-fpu=$with_fpu" 1>&2
> +			exit 1
> +			;;
> +		esac
> +
> +		# Handle --with-multilib-list.
> +		if test x${with_multilib_list} == x \
> +		   || test x${with_multilib_list} == xno \
> +		   || test x${with_multilib_list} == xdefault \
> +		   || test x${enable_multilib} != xyes; then
> +
> +			with_multilib_list="${with_abi}/${with_abiext}"
> +		fi
> +
> +		# Check if the configured default ABI combination is included in
> +		# ${with_multilib_list}.
> +		loongarch_multilib_list_sane=no
> +
> +		# This one goes to TM_MULTILIB_CONFIG, for use in t-linux.
> +		loongarch_multilib_list_make=""
> +
> +		# This one goes to tm_defines, for use in loongarch-driver.c.
> +		loongarch_multilib_list_c=""
> +
> +		for e in $(tr ',' ' ' <<< "${with_multilib_list}"); do
> +			e=($(tr '/' ' ' <<< "$e"))
> +
> +			# Base ABI type
> +			case ${e[0]} in
> +			lp64d) loongarch_multilib_list_c+="ABI_BASE_LP64D,";;
> +			lp64f) loongarch_multilib_list_c+="ABI_BASE_LP64F,";;
> +			lp64s) loongarch_multilib_list_c+="ABI_BASE_LP64S,";;
> +			*)
> +				echo "Unknown base ABI \"${e[0]}\" in --with-multilib-list." 1>&2
> +				exit 1
> +				;;
> +			esac
> +			loongarch_multilib_list_make+="mabi=${e[0]}"
> +
> +			# ABI extension type
> +			case ${e[1]} in
> +			"" | base)
> +				loongarch_multilib_list_make+=""
> +				loongarch_multilib_list_c+="ABI_EXT_BASE,"
> +				e[1]="base"
> +				;;
> +			*)
> +				echo "Unknown ABI extension \"${e[1]}\" in --with-multilib-list." 1>&2
> +				exit 1
> +				;;
> +			esac
> +
> +			case ${e[2]} in
> +			"") ;; # OK
> +			*)
> +				echo "Unknown ABI in --with-multilib-list." 1>&2
> +				exit 1
> +				;;
> +			esac
> +
> +			if test x${with_abi} != x && test x${with_abiext} != x; then
> +				if test x${e[0]} == x${with_abi} \
> +				&& test x${e[1]} == x${with_abiext}; then
> +					loongarch_multilib_list_sane=yes
> +				fi
> +			fi
> +
> +			loongarch_multilib_list_make+=","
> +		done
> +
> +		# Check if the default ABI combination is in the default list.
> +		if test x${loongarch_multilib_list_sane} == xno; then
> +			if test x${with_abiext} == xbase; then
> +				with_abiext=""
> +			else
> +				with_abiext="/${with_abiext}"
> +			fi
> +
> +			echo "Default ABI combination (${with_abi}${with_abiext})" \
> +			"not found in --with-multilib-list." 1>&2
> +			exit 1
> +		fi
> +
> +		# Remove the excessive appending comma.
> +		loongarch_multilib_list_c=${loongarch_multilib_list_c:0:-1}
> +		loongarch_multilib_list_make=${loongarch_multilib_list_make:0:-1}

I think ${…:…:…} is a bash-ism.  And although bash is probably the
system shell on the targets you care about, it's sometimes useful
to test GCC with SHELL=/bin/dash in order to flush out unportable
shell constructs.

Maybe use sed 's/,$//' instead, even though it's much clunkier
than your version.
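i.e. something like (untested):

  loongarch_multilib_list_c=`echo "${loongarch_multilib_list_c}" | sed 's/,$//'`
  loongarch_multilib_list_make=`echo "${loongarch_multilib_list_make}" | sed 's/,$//'`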

> +		;;
> +
>  	nds32*-*-*)
>  		supported_defaults="arch cpu nds32_lib float fpu_config"
>  
> @@ -5370,6 +5733,51 @@ case ${target} in
>  		tmake_file="mips/t-mips $tmake_file"
>  		;;
>  
> +	loongarch*-*-*)
> +		# Export canonical triplet.
> +		tm_defines+=" LA_MULTIARCH_TRIPLET=${la_canonical_triplet}"

Similarly here, += is a bashism.  This unfortunately needs to be:

   tm_defines="${tm_defines} LA_MULTIARCH_TRIPLET=${la_canonical_triplet}"

instead.  Same for the rest of the patch.

> +
> +		# Define macro __DISABLE_MULTILIB if --disable-multilib
> +		tm_defines+=" TM_MULTILIB_LIST=${loongarch_multilib_list_c}"
> +		if test x$enable_multilib == xyes; then
> +			TM_MULTILIB_CONFIG="${loongarch_multilib_list_make}"
> +		else
> +			tm_defines+=" __DISABLE_MULTILIB"
> +		fi

__FOO macros are in the implementation namespace so GCC needs to avoid
defining them itself.  Maybe LA_DISABLE_MULTILIB instead?
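i.e.:

	tm_defines="${tm_defines} LA_DISABLE_MULTILIB"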

> […]
> diff --git a/gcc/config/loongarch/genopts/genstr.sh b/gcc/config/loongarch/genopts/genstr.sh
> new file mode 100755
> index 00000000000..6c1211c9e92
> --- /dev/null
> +++ b/gcc/config/loongarch/genopts/genstr.sh
> @@ -0,0 +1,91 @@
> +#!/bin/sh
> +# A simple script that generates loongarch-str.h and loongarch.opt
> +# from genopt/loongarch-optstr.
> +#
> +# Copyright (C) 2021-2022 Free Software Foundation, Inc.
> +#
> +# This file is part of GCC.
> +#
> +# GCC is free software; you can redistribute it and/or modify it under
> +# the terms of the GNU General Public License as published by the Free
> +# Software Foundation; either version 3, or (at your option) any later
> +# version.
> +#
> +# GCC is distributed in the hope that it will be useful, but WITHOUT
> +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +# License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +cd "$(dirname "$0")"

Similarly here we need to avoid bash's $(...).  Would be worth trying
the script with /bin/dash to see whether it works (i.e. produces the same
output as bash).

> diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
> new file mode 100644
> index 00000000000..01c8b3e5308
> --- /dev/null
> +++ b/gcc/config/loongarch/genopts/loongarch.opt.in
> @@ -0,0 +1,189 @@
> +; Generated by "genstr" from the template "loongarch.opt.in"
> +; and definitions from "loongarch-strings".
> +;
> +; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> +;
> +; This file is part of GCC.
> +;
> +; GCC is free software; you can redistribute it and/or modify it under
> +; the terms of the GNU General Public License as published by the Free
> +; Software Foundation; either version 3, or (at your option) any later
> +; version.
> +;
> +; GCC is distributed in the hope that it will be useful, but WITHOUT
> +; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +; License for more details.
> +;
> +; You should have received a copy of the GNU General Public License
> +; along with GCC; see the file COPYING3.  If not see
> +; <http://www.gnu.org/licenses/>.
> +;
> +
> +; Variables (macros) that should be exported by loongarch.opt:
> +;   la_opt_switches,
> +;   la_opt_abi_base, la_opt_abi_ext,
> +;   la_opt_cpu_arch, la_opt_cpu_tune,
> +;   la_opt_fpu,
> +;   la_cmodel.
> +
> +HeaderInclude
> +config/loongarch/loongarch-opts.h
> +
> +HeaderInclude
> +config/loongarch/loongarch-str.h
> +
> +Variable
> +HOST_WIDE_INT la_opt_switches = 0
> +
> +; ISA related options
> +;; Base ISA
> +Enum
> +Name(isa_base) Type(int)
> +Basic ISAs of LoongArch:
> +
> +EnumValue
> +Enum(isa_base) String(@@STR_ISA_BASE_LA64V100@@) Value(ISA_BASE_LA64V100)
> +
> +
> +;; ISA extensions / adjustments
> +Enum
> +Name(isa_ext_fpu) Type(int)
> +FPU types of LoongArch:
> +
> +EnumValue
> +Enum(isa_ext_fpu) String(@@STR_ISA_EXT_NOFPU@@) Value(ISA_EXT_NOFPU)
> +
> +EnumValue
> +Enum(isa_ext_fpu) String(@@STR_ISA_EXT_FPU32@@) Value(ISA_EXT_FPU32)
> +
> +EnumValue
> +Enum(isa_ext_fpu) String(@@STR_ISA_EXT_FPU64@@) Value(ISA_EXT_FPU64)
> +
> +m@@OPTSTR_ISA_EXT_FPU@@=
> +Target RejectNegative Joined ToLower Enum(isa_ext_fpu) Var(la_opt_fpu) Init(M_OPTION_NOT_SEEN)
> +-m@@OPTSTR_ISA_EXT_FPU@@=FPU	Generate code for the given FPU.

For the record: I've not seen this kind of thing (with the @@
replacements) being done before, but I can see that it could be
a way of preventing typos creeping in.  It doesn't contradict the
“global” GCC style and so IMO it's a perfectly valid choice for
target maintainers to make.

> […]
> +mmemcpy
> +Target Mask(MEMCPY)
> +Don't optimize block moves.

Going back to the above, if the intention is that this option
only has an effect when optimising for size, the documentation
ought to say that.

> +
> +mlra
> +Target Var(loongarch_lra_flag) Init(1) Save
> +Use LRA instead of reload.

Please drop this option and only support LRA.  No new code should
use old reload, even as a default-off option.

> +
> +noasmopt
> +Driver

Do you need this?  It looks like a MIPS hangover.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 02/12] LoongArch Port: gcc build
  2022-03-06 10:29   ` Richard Sandiford
@ 2022-03-06 10:34     ` Andreas Schwab
  0 siblings, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2022-03-06 10:34 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph, richard.sandiford

On Mär 06 2022, Richard Sandiford via Gcc-patches wrote:

> Similarly here we need to avoid bash's $(...).

$(...) is 100% POSIX.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 04/12] LoongArch Port: Machine description files.
  2022-03-04  7:18 ` [PATCH v8 04/12] LoongArch Port: Machine description files xuchenghua
@ 2022-03-06 16:16   ` Richard Sandiford
  2022-03-07  3:59     ` 程璐璐
  2022-03-19  9:40     ` 程璐璐
  2022-03-07 20:15   ` Xi Ruoyao
  1 sibling, 2 replies; 34+ messages in thread
From: Richard Sandiford @ 2022-03-06 16:16 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph

Hi,

Some comments below, but otherwise it looks good to me.

xuchenghua@loongson.cn writes:
> […]
> +(define_memory_constraint "k"
> +  "A memory operand whose address is formed by a base register and (optionally scaled)
> +   index register."
> +  (and (match_code "mem")
> +       (not (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)"))
> +       (not (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)"))))

It's not really safe to test MEM addresses using only negative conditions.
There needs to be a positive condition too, even if it's only:

       (match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, 0),
						 MEM_ADDR_SPACE (op))")))

(from common.md).

> […]
> +(define_constraint "v"
> +  "A nsigned 64-bit constant and low 44-bit is zero (for logic instructions)."

typo

> […]
> +(define_memory_constraint "ZB"
> +  "@internal
> +  An address that is held in a general-purpose register.
> +  The offset is zero"
> +  (and (match_code "mem")
> +       (match_test "GET_CODE (XEXP (op,0)) == REG")))

It'd be good to use things like REG_P in new code.

Formatting nit: should be a space after “op,”.
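i.e.:

       (match_test "REG_P (XEXP (op, 0))")))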

> […]
> +;; Pipeline descriptions.
> +;;
> +;; generic.md provides a fallback for processors without a specific
> +;; pipeline description.  It is derived from the old define_function_unit
> +;; version and uses the "alu" and "imuldiv" units declared below.
> +;;
> +;; Some of the processor-specific files are also derived from old
> +;; define_function_unit descriptions and simply override the parts of
> +;; generic.md that don't apply.  The other processor-specific files
> +;; are self-contained.

I don't think these last two paragraphs apply to the new code.
The MIPS generic.md was converted from a much older pipeline
description format.  The conversion meant the older processor
descriptions weren't self-contained and relied on a mixture
of processor-specific things (in their own file) and generic
things (in this file).

New processor descriptions should be self-contained as far
as possible, in terms of not sharing cpu units with other
processor descriptions.

> +(define_automaton "alu,imuldiv")
> +
> +(define_cpu_unit "alu" "alu")
> +(define_cpu_unit "imuldiv" "imuldiv")
> +
> +;; Ghost instructions produce no real code.
> +;; They exist purely to express an effect on dataflow.
> +(define_insn_reservation "ghost" 0
> +  (eq_attr "type" "ghost")
> +  "nothing")
> +
> +;; This file is derived from the old define_function_unit description.
> +;; Each reservation can be overridden on a processor-by-processor basis.

Same for this last comment.

The "ghost" reservation is inherently sharable because it doesn't
use any CPU units.  But for a new port, I think the other reservations
in this file should be conditional on a particular -mtune.

> […]
> diff --git a/gcc/config/loongarch/la464.md b/gcc/config/loongarch/la464.md
> new file mode 100644
> index 00000000000..ae3808b51bb
> --- /dev/null
> +++ b/gcc/config/loongarch/la464.md
> @@ -0,0 +1,132 @@
> +;; Pipeline model for LoongArch LA464 cores.
> +
> +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> +;; Contributed by Loongson Ltd.
> +
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it
> +;; under the terms of the GNU General Public License as published
> +;; by the Free Software Foundation; either version 3, or (at your
> +;; option) any later version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT
> +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +;; License for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +;; Uncomment the following line to output automata for debugging.
> +;; (automata_option "v")
> +
> +;; Automaton for integer instructions.
> +(define_automaton "la464_a_alu")
> +
> +;; Automaton for floating-point instructions.
> +(define_automaton "la464_a_falu")
> +
> +;; Automaton for memory operations.
> +(define_automaton "la464_a_mem")
> +
> +;; Describe the resources.
> +
> +(define_cpu_unit "la464_alu1" "la464_a_alu")
> +(define_cpu_unit "la464_alu2" "la464_a_alu")
> +(define_cpu_unit "la464_mem1" "la464_a_mem")
> +(define_cpu_unit "la464_mem2" "la464_a_mem")
> +(define_cpu_unit "la464_falu1" "la464_a_falu")
> +(define_cpu_unit "la464_falu2" "la464_a_falu")
> +
> +;; Describe instruction reservations.
> +
> +(define_insn_reservation "la464_arith" 1
> +  (and (match_test "TARGET_ARCH_LA464")

Normally scheduling should be determined by -mtune (with a default
-mtune chosen by -march when no explicit -mtune is given).  So I was
surprised to see this testing TARGET_ARCH_* instead of TARGET_TUNE_*.

> +       (eq_attr "type" "arith,clz,const,logical,
> +			move,nop,shift,signext,slt"))
> +  "la464_alu1 | la464_alu2")
> […]
> +;; Main data type used by the insn
> +(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,OI,SF,DF,TF,FCC"
> +  (const_string "unknown"))

Do any patterns use mode==OI?  Reason for asking is that:

> […]
> +;; True if the main data type is four times of the size of a word.
> +(define_attr "qword_mode" "no,yes"
> +  (cond [(and (eq_attr "mode" "TI,TF")
> +	      (not (match_test "TARGET_64BIT")))
> +	 (const_string "yes")]
> +	(const_string "no")))
> +
> +;; True if the main data type is eight times of the size of a word.
> +(define_attr "oword_mode" "no,yes"
> +  (cond [(and (eq_attr "mode" "OI")
> +	      (not (match_test "TARGET_64BIT")))
> +	 (const_string "yes")]
> +	(const_string "no")))

…it seemed inconsistent to treat OI as 8 words on 32-bit targets
but not as 4 words (qword_mode) on 64-bit targets.

It looks like oword_mode and OI might not be used, in which case
it's probably easiest to remove them for now.

> +(define_attr "compression" "none,all"
> +  (const_string "none"))

Does anything use this, or is it a holdover from MIPS?  I could see some
definitions of the attribute later in the file, but there didn't seem
to be any users.

> […]
> +;; This mode iterator allows the QI HI SI and DI extension patterns to be

incomplete sentence.

> +(define_mode_iterator QHWD [QI HI SI (DI "TARGET_64BIT")])
> +
> +;; Iterator for hardware-supported floating-point modes.
> +(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
> +			    (DF "TARGET_DOUBLE_FLOAT")])
> +
> +;; A floating-point mode for which moves involving FPRs may need to be split.

Probably s/floating-point //, given that the iterator includes DI.

> +(define_mode_iterator SPLITF
> +  [(DF "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
> +   (DI "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
> +   (TF "TARGET_64BIT && TARGET_DOUBLE_FLOAT")])
> +
> +;; In GPR templates, a string like "mul.<d>" will expand to "mul" in the
> +;; 32-bit "mul.w" and "mul.d" in the 64-bit version.

Maybe:

;; In GPR templates, a string like "mul.<d>" will expand to "mul.w" in the
;; 32-bit version and "mul.d" in the 64-bit version.

> +(define_mode_attr d [(SI "w") (DI "d")])
> +
> +;; This attribute gives the length suffix for a load or store instruction.
> +;; The same suffixes work for zero and sign extensions.
> +(define_mode_attr size [(QI "b") (HI "h") (SI "w") (DI "d")])
> +(define_mode_attr SIZE [(QI "B") (HI "H") (SI "W") (DI "D")])
> +
> +;; This attributes gives the mode mask of a SHORT.

s/attributes/attribute/

> +(define_mode_attr mask [(QI "0x00ff") (HI "0xffff")])
> +
> +;; This attributes gives the size (bits) of a SHORT.

Maybe:

;; This attribute gives the number of bits in a SHORT minus one.

since it's 7,15 rather than 8,16.

> +(define_mode_attr qi_hi [(QI "7") (HI "15")])

Just a suggestion, but it might be worth renaming this.  qi_hi only
describes the range of inputs rather than what the attribute is.

> […]
> +(define_insn "*addsi3_extended"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
> +	(sign_extend:DI
> +	     (plus:SI (match_operand:SI 1 "register_operand" "r,r")
> +		      (match_operand:SI 2 "arith_operand" "r,I"))))]
> +  "TARGET_64BIT"
> +  "add%i2.w\t%0,%1,%2"
> +  [(set_attr "alu_type" "add")
> +   (set_attr "mode" "SI")])
> +
> +(define_insn "*addsi3_extended2"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
> +	(sign_extend:DI
> +	  (subreg:SI (plus:DI (match_operand:DI 1 "register_operand" "r,r")
> +			      (match_operand:DI 2 "arith_operand"    "r,I"))
> +		     0)))]
> +  "TARGET_64BIT"
> +  "add%i2.w\t%0,%1,%2"
> +  [(set_attr "alu_type" "add")
> +   (set_attr "mode" "SI")])

This is really a question for part 5, but it affects this part too:

I notice the port defines TRULY_NOOP_TRUNCATION to false for DI->SI,
like MIPS does.  Are you sure that's the right trade-off?  It had to
be defined that way for MIPS because 32-bit MIPS instructions gave
undefined results if the input operands weren't in sign-extended form.
For example a 32-bit add with 64-bit register contents:

   0x(ffffffff_)fffffffe
 + 0x(00000000_)00000001

was guaranteed to give -1 (sign-extended) but:

   0x(00000000_)fffffffe
 + 0x(00000000_)00000001

gave an undefined result.

Does Loongson have the same restriction, or do the 32-bit instructions
simply ignore the upper 32 bits?  If Loongson ignores the upper bits
then it would probably be better not to define TRULY_NOOP_TRUNCATION.

If you do that, the subreg pattern above shouldn't be needed; the
subreg should get folded away by target-independent code.

> +;;
> +;;  ....................
> +;;
> +;;	MULTIPLICATION
> +;;
> +;;  ....................
> +;;
> +
> +(define_insn "mul<mode>3"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
> +		   (match_operand:ANYF 2 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"

Not a big deal, just noticed that some ANYF patterns (like this one)
add an explicit TARGET_HARD_FLOAT on top of the ANYF conditions:

(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
			    (DF "TARGET_DOUBLE_FLOAT")])

while other patterns don't.  Both are correct of course, but it might
be good to use one style throughout the file.  At first, when I saw
the pattern above, I thought the TARGET_HARD_FLOAT was missing from
the earlier patterns.

> […]
> +(define_insn "*div<mode>3"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(div:ANYF (match_operand:ANYF 1 "register_operand" "f")
> +		  (match_operand:ANYF 2 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "fdiv.<fmt>\t%0,%1,%2"
> +  [(set_attr "type" "fdiv")
> +   (set_attr "mode" "<UNITMODE>")
> +   (set_attr "insn_count" "1")])
> +
> +;; In 3A5000, the reciprocal operation is the same as the division operation.
> +
> +(define_insn "*recip<mode>3"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
> +		  (match_operand:ANYF 2 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "frecip.<fmt>\t%0,%2"
> +  [(set_attr "type" "frdiv")
> +   (set_attr "mode" "<UNITMODE>")
> +   (set_attr "insn_count" "1")])

Very minor, but these insn_counts seem redundant.  (MIPS needed them
due to the SB1 errata workaround.)

> […]
> +;; Floating point multiply accumulate instructions.
> +
> +;; a * b + c
> +(define_expand "fma<mode>4"
> +  [(set (match_operand:ANYF 0 "register_operand")
> +	(fma:ANYF (match_operand:ANYF 1 "register_operand")
> +		  (match_operand:ANYF 2 "register_operand")
> +		  (match_operand:ANYF 3 "register_operand")))]
> +  "TARGET_HARD_FLOAT")
> +
> +(define_insn "*fma<mode>4_madd4"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
> +		  (match_operand:ANYF 2 "register_operand" "f")
> +		  (match_operand:ANYF 3 "register_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "fmadd.<fmt>\t%0,%1,%2,%3"
> +  [(set_attr "type" "fmadd")
> +   (set_attr "mode" "<UNITMODE>")])

This pair of patterns could be a single define_insn, like for fms.
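i.e. just (merging the two):

(define_insn "fma<mode>4"
  [(set (match_operand:ANYF 0 "register_operand" "=f")
	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
		  (match_operand:ANYF 2 "register_operand" "f")
		  (match_operand:ANYF 3 "register_operand" "f")))]
  "TARGET_HARD_FLOAT"
  "fmadd.<fmt>\t%0,%1,%2,%3"
  [(set_attr "type" "fmadd")
   (set_attr "mode" "<UNITMODE>")])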

> […]
> +;; Integer truncation patterns.  Truncating SImode values to smaller
> +;; modes is a no-op, as it is for most other GCC ports.  Truncating
> +;; DImode values to SImode is not a no-op for TARGET_64BIT since we
> +;; need to make sure that the lower 32 bits are properly sign-extended
> +;; (see TARGET_TRULY_NOOP_TRUNCATION).  Truncating DImode values into modes
> +;; smaller than SImode is equivalent to two separate truncations:
> +;;
> +;;			  A       B
> +;;    DI ---> HI  ==  DI ---> SI ---> HI
> +;;    DI ---> QI  ==  DI ---> SI ---> QI
> +;;
> +;; Step A needs a real instruction but step B does not.
> +
> +(define_insn "truncdi<mode>2"
> +  [(set (match_operand:SUBDI 0 "nonimmediate_operand" "=r,m,k")
> +	(truncate:SUBDI (match_operand:DI 1 "register_operand" "r,r,r")))]
> +  "TARGET_64BIT"
> +  "@
> +    slli.w\t%0,%1,0
> +    st.<size>\t%1,%0
> +    stx.<size>\t%1,%0"
> +  [(set_attr "move_type" "sll0,store,store")
> +   (set_attr "mode" "SI")])

Following on from the above, this pattern wouldn't be needed
if TRULY_NOOP_TRUNCATION can be left undefined.

> […]
> +;; Combiner patterns to optimize shift/truncate combinations.
> +
> +(define_insn "*ashr_trunc<mode>"
> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
> +	(truncate:SUBDI
> +	  (ashiftrt:DI (match_operand:DI 1 "register_operand" "r")
> +		       (match_operand:DI 2 "const_arith_operand" ""))))]
> +  "TARGET_64BIT && IN_RANGE (INTVAL (operands[2]), 32, 63)"
> +  "srai.d\t%0,%1,%2"
> +  [(set_attr "type" "shift")
> +   (set_attr "mode" "<MODE>")])
> +
> +(define_insn "*lshr32_trunc<mode>"
> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
> +	(truncate:SUBDI
> +	  (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
> +		       (const_int 32))))]
> +  "TARGET_64BIT"
> +  "srai.d\t%0,%1,32"
> +  [(set_attr "type" "shift")
> +   (set_attr "mode" "<MODE>")])

Same for these.

> +\f
> +;;
> +;;  ....................
> +;;
> +;;	ZERO EXTENSION
> +;;
> +;;  ....................
> +
> +(define_insn "zero_extendsidi2"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
> +	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,ZC,m,k")))]
> +  "TARGET_64BIT"
> +  "@
> +   bstrpick.d\t%0,%1,31,0
> +   ldptr.w\t%0,%1\n\tlu32i.d\t%0,0

FWIW, splitting this after reload would give more scheduling freedom,
but it's not necessary to do that for the submission.

> +   ld.wu\t%0,%1
> +   ldx.wu\t%0,%1"
> +  [(set_attr "move_type" "arith,load,load,load")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "1,2,1,1")])
> +
> […]
> +;; Combiner patterns to optimize truncate/zero_extend combinations.
> +
> +(define_insn "*zero_extend<GPR:mode>_trunc<SHORT:mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> +	(zero_extend:GPR
> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "bstrpick.w\t%0,%1,<SHORT:qi_hi>,0"
> +  [(set_attr "move_type" "pick_ins")
> +   (set_attr "mode" "<GPR:MODE>")])
> +
> +(define_insn "*zero_extendhi_truncqi"
> +  [(set (match_operand:HI 0 "register_operand" "=r")
> +	(zero_extend:HI
> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "andi\t%0,%1,0xff"
> +  [(set_attr "alu_type" "and")
> +   (set_attr "mode" "HI")])
> +\f
> +;;
> +;;  ....................
> +;;
> +;;	SIGN EXTENSION
> +;;
> +;;  ....................
> +
> +;; Extension insns.
> +;; Those for integer source operand are ordered widest source type first.
> +
> +;; When TARGET_64BIT, all SImode integer should already be in sign-extended
> +;; form (see TARGET_TRULY_NOOP_TRUNCATION and truncdisi2).  We can therefore
> +;; get rid of register->register instructions if we constrain the source to
> +;; be in the same register as the destination.
> +;;
> +;; Only the pre-reload scheduler sees the type of the register alternatives;
> +;; we split them into nothing before the post-reload scheduler runs.
> +;; These alternatives therefore have type "move" in order to reflect
> +;; what happens if the two pre-reload operands cannot be tied, and are
> +;; instead allocated two separate GPRs.
> +(define_insn_and_split "extendsidi2"
> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
> +	(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,ZC,m,k")))]
> +  "TARGET_64BIT"
> +  "@
> +   #
> +   ldptr.w\t%0,%1
> +   ld.w\t%0,%1
> +   ldx.w\t%0,%1"
> +  "&& reload_completed && register_operand (operands[1], VOIDmode)"
> +  [(const_int 0)]
> +{
> +  emit_note (NOTE_INSN_DELETED);
> +  DONE;
> +}
> +  [(set_attr "move_type" "move,load,load,load")
> +   (set_attr "mode" "DI")])

The three patterns above would also look different without
the definition of TRULY_NOOP_TRUNCATION.

> […]
> +(define_insn "*extenddi_truncate<mode>"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(sign_extend:DI
> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "ext.w.<size>\t%0,%1"
> +  [(set_attr "move_type" "signext")
> +   (set_attr "mode" "DI")])
> +
> +(define_insn "*extendsi_truncate<mode>"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +	(sign_extend:SI
> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "ext.w.<size>\t%0,%1"
> +  [(set_attr "move_type" "signext")
> +   (set_attr "mode" "SI")])
> +
> +(define_insn "*extendhi_truncateqi"
> +  [(set (match_operand:HI 0 "register_operand" "=r")
> +	(sign_extend:HI
> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
> +  "TARGET_64BIT"
> +  "ext.w.b\t%0,%1"
> +  [(set_attr "move_type" "signext")
> +   (set_attr "mode" "SI")])

Same for these.

> +
> +(define_insn "extendsfdf2"
> +  [(set (match_operand:DF 0 "register_operand" "=f")
> +	(float_extend:DF (match_operand:SF 1 "register_operand" "f")))]
> +  "TARGET_DOUBLE_FLOAT"
> +  "fcvt.d.s\t%0,%1"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "cnv_mode"	"S2D")
> +   (set_attr "mode" "DF")])

Very minor, but it's probably better to use a space rather than a tab
after "cnv_mode", for consistency with the other attributes.  Same in
the rest of the file.

> +;; floating point value by converting to value to an unsigned integer

typo, maybe:

;; Convert a floating-point value to an unsigned integer.

> +
> +(define_expand "fixuns_truncdfsi2"
> +  [(set (match_operand:SI 0 "register_operand")
> +	(unsigned_fix:SI (match_operand:DF 1 "register_operand")))]
> +  "TARGET_DOUBLE_FLOAT"
> +{
> +  rtx reg1 = gen_reg_rtx (DFmode);
> +  rtx reg2 = gen_reg_rtx (DFmode);
> +  rtx reg3 = gen_reg_rtx (SImode);
> +  rtx_code_label *label1 = gen_label_rtx ();
> +  rtx_code_label *label2 = gen_label_rtx ();
> +  rtx test;
> +  REAL_VALUE_TYPE offset;
> +
> +  real_2expN (&offset, 31, DFmode);
> +
> +  if (reg1)		      /* Turn off complaints about unreached code.  */
> +    {

I think we can drop this workaround.  IIRC it was needed for the
MIPS IRIX compilers, or maybe it was some ancient version of GCC.

> +      loongarch_emit_move (reg1,
> +			   const_double_from_real_value (offset, DFmode));
> +      do_pending_stack_adjust ();
> +
> +      test = gen_rtx_GE (VOIDmode, operands[1], reg1);
> +      emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
> +
> +      emit_insn (gen_fix_truncdfsi2 (operands[0], operands[1]));
> +      emit_jump_insn (gen_rtx_SET (pc_rtx,
> +				   gen_rtx_LABEL_REF (VOIDmode, label2)));
> +      emit_barrier ();
> +
> +      emit_label (label1);
> +      loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
> +      loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
> +				     (BITMASK_HIGH, SImode)));
> +
> +      emit_insn (gen_fix_truncdfsi2 (operands[0], reg2));
> +      emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
> +
> +      emit_label (label2);
> +
> +      /* Allow REG_NOTES to be set on last insn (labels don't have enough
> +	 fields, and can't be used for REG_NOTES anyway).  */
> +      emit_use (stack_pointer_rtx);
> +      DONE;
> +    }
> +})
> […]
> +;; Allow combine to split complex const_int load sequences, using operand 2
> +;; to store the intermediate results.  See move_operand for details.
> +(define_split
> +  [(set (match_operand:GPR 0 "register_operand")
> +	(match_operand:GPR 1 "splittable_const_int_operand"))
> +   (clobber (match_operand:GPR 2 "register_operand"))]
> +  ""
> +  [(const_int 0)]
> +{
> +  loongarch_move_integer (operands[2], operands[0], INTVAL (operands[1]));
> +  DONE;
> +})

Does this define_split trigger?  I couldn't see an associated
define_insn that would provide the clobber.

> +(define_insn "*movsi_internal"
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m,*r,*z")
> +	(match_operand:SI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f,*z,*r"))]
> +  "(register_operand (operands[0], SImode)
> +       || reg_or_0_operand (operands[1], SImode))"

Minor formatting nit: too much indentation on the line above.  Same for
the other moves.

> […]
> +;; Conditional move instructions.
> +
> +(define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
> +  [(set (match_operand:GPR 0 "register_operand" "=r,r")
> +	(if_then_else:GPR
> +	 (equality_op:GPR2 (match_operand:GPR2 1 "register_operand" "r,r")
> +			   (const_int 0))
> +	 (match_operand:GPR 2 "reg_or_0_operand" "r,J")
> +	 (match_operand:GPR 3 "reg_or_0_operand" "J,r")))]
> +  "register_operand (operands[2], <GPR:MODE>mode)
> +       != register_operand (operands[3], <GPR:MODE>mode)"

Same here.

> +  "@
> +   <sel>\t%0,%2,%1
> +   <selinv>\t%0,%3,%1"
> +  [(set_attr "type" "condmove")
> +   (set_attr "mode" "<GPR:MODE>")])
> +
> +;; sel.fmt copies the 3rd argument when the 1st is non-zero and the 2nd

s/sel.fmt/fsel/

> +;; argument if the 1st is zero.  This means operand 2 and 3 are
> +;; inverted in the instruction.
> +
> +(define_insn "*sel<mode>"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(if_then_else:ANYF
> +	 (ne:FCC (match_operand:FCC 1 "register_operand" "z")
> +		 (const_int 0))
> +	 (match_operand:ANYF 2 "reg_or_0_operand" "f")
> +	 (match_operand:ANYF 3 "reg_or_0_operand" "f")))]
> +  "TARGET_HARD_FLOAT"
> +  "fsel\t%0,%3,%2,%1"
> +  [(set_attr "type" "condmove")
> +   (set_attr "mode" "<ANYF:MODE>")])
> […]
> +(define_insn "lu52i_d"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(ior:DI
> +	  (and:DI (match_operand:DI 1 "register_operand" "r")
> +		  (match_operand 2 "lu52i_mask_operand"))
> +	  (match_operand 3 "const_lu52i_operand" "v")))]
> +    "TARGET_64BIT"
> +    "lu52i.d\t%0,%1,%X3>>52"
> +    [(set_attr "type" "arith")
> +     (set_attr "mode" "DI")])

Formatting nit: too much indentation from "TARGET_64BIT" onwards.

> +
> +;; Convert floating-point numbers to integers
> +(define_insn "frint_<fmt>"
> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
> +	(unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
> +		      UNSPEC_FRINT))]
> +  "TARGET_HARD_FLOAT"
> +  "frint.<fmt>\t%0,%1"
> +  [(set_attr "type" "fcvt")
> +   (set_attr "mode" "<MODE>")])
> +
> +;; LoongArch supports loading and storing a floating point register from
> +;; the sum of two general-purpose registers.  We use two versions for each of
> +;; these four instructions: one where the two general-purpose registers are
> +;; SImode, and one where they are DImode.  This is because general-purpose
> +;; registers will be in SImode when they hold 32-bit values, but,
> +;; since the 32-bit values are always sign extended, the f{ld/st}x.{s/d}
> +;; instructions will still work correctly.
> +
> +;; ??? Perhaps it would be better to support these instructions by
> +;; modifying TARGET_LEGITIMATE_ADDRESS_P and friends.  However, since
> +;; these instructions can only be used to load and store floating
> +;; point registers, that would probably cause trouble in reload.

Does this comment apply to Loongson?  It looks from:

  +;; Similarly for LoongArch indexed GPR loads and stores.
  +(define_mode_attr loadx [(QI "ldx.b")
  +			 (HI "ldx.h")
  +			 (SI "ldx.w")
  +			 (DI "ldx.d")])
  +(define_mode_attr storex [(QI "stx.b")
  +			  (HI "stx.h")
  +			  (SI "stx.w")
  +			  (DI "stx.d")])

like Loongson has a full set of indexed loads and stores, so supporting
indexed addresses in TARGET_LEGITIMATE_ADDRESS_P should work.

Also, the MIPS comment predates LRA, which is better than old reload
at handling irregular memory address requirements.  Accepting indexed
addresses in TARGET_LEGITIMATE_ADDRESS_P would allow ivopts to optimise
the code better.
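(For example, in a simple copy loop -- purely illustrative:

  void
  f (float *dst, float *src, long n)
  {
    for (long i = 0; i < n; i++)
      dst[i] = src[i];
  }

ivopts could then keep a single induction variable and use the indexed
loads and stores directly, rather than incrementing both pointers.)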

> […]
> +;; Expand in-line code to clear the instruction cache between operand[0] and
> +;; operand[1].
> +(define_expand "clear_cache"
> +  [(match_operand 0 "pmode_register_operand")
> +   (match_operand 1 "pmode_register_operand")]
> +  ""
> +  "
> +{
> +  emit_insn (gen_ibar (const0_rtx));
> +  DONE;
> +}")

Minor nit: the quotes before { and after } are unnecessary.

> […]
> +(define_insn "asrtle_d"
> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
> +			      (match_operand:DI 1 "register_operand" "r")]
> +			      UNSPECV_ASRTLE_D)]
> +  "TARGET_64BIT"
> +  "asrtle.d\t%0,%1"
> +  [(set_attr "type" "load")
> +   (set_attr "mode" "DI")])
> +
> +(define_insn "asrtgt_d"
> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
> +			      (match_operand:DI 1 "register_operand" "r")]
> +			      UNSPECV_ASRTGT_D)]
> +  "TARGET_64BIT"
> +  "asrtgt.d\t%0,%1"
> +  [(set_attr "type" "load")
> +   (set_attr "mode" "DI")])

Formatting: the unspec_volatile pattern is indented too far in both cases.

> […]
> +;; The following templates were added to generate "bstrpick.d + alsl.d"
> +;; instruction pairs.
> +;; It is required that the values of const_immalsl_operand and
> +;; immediate_operand must have the following correspondence:
> +;;
> +;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
> +
> +(define_insn "zero_extend_ashift1"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
> +			   (match_operand 2 "const_immalsl_operand" ""))
> +		(match_operand 3 "immediate_operand" "")))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
> +  [(set_attr "type" "arith")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "2")])

Without the TRULY_NOOP_TRUNCATION definition, this pattern would be
redundant with…

> +(define_insn "zero_extend_ashift2"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +	(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
> +			   (match_operand 2 "const_immalsl_operand" ""))
> +		(match_operand 3 "immediate_operand" "")))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
> +  [(set_attr "type" "arith")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "2")])

…this one.  I'm surprised it isn't already TBH.  Doesn't the subreg match:

  (match_operand:DI 1 "register_operand" "r")

?

> +
> +(define_insn "alsl_paired1"
> +  [(set (match_operand:DI 0 "register_operand" "=&r")
> +	(plus:DI (and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
> +				    (match_operand 2 "const_immalsl_operand" ""))
> +			 (match_operand 3 "immediate_operand" ""))
> +		 (match_operand:DI 4 "register_operand" "r")))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
> +  [(set_attr "type" "arith")
> +  (set_attr "mode" "DI")
> +  (set_attr "insn_count" "2")])
> +
> +(define_insn "alsl_paired2"
> +  [(set (match_operand:DI 0 "register_operand" "=&r")
> +	(plus:DI (match_operand:DI 1 "register_operand" "r")
> +		 (and:DI (ashift:DI (match_operand:DI 2 "register_operand" "r")
> +				    (match_operand 3 "const_immalsl_operand" ""))
> +			 (match_operand 4 "immediate_operand" ""))))]
> +  "TARGET_64BIT
> +   && ((INTVAL (operands[4]) >> INTVAL (operands[3])) == 0xffffffff)"
> +  "bstrpick.d\t%0,%2,31,0\n\talsl.d\t%0,%0,%1,%3"
> +  [(set_attr "type" "arith")
> +   (set_attr "mode" "DI")
> +   (set_attr "insn_count" "2")])

Same for this pair.

> […]
> +(define_expand "tablejump"
> +  [(set (pc)
> +	(match_operand 0 "register_operand"))
> +   (use (label_ref (match_operand 1 "")))]
> +  ""
> +{
> +  if (flag_pic)
> +      operands[0] = expand_simple_binop (Pmode, PLUS, operands[0],
> +					 gen_rtx_LABEL_REF (Pmode,
> +							    operands[1]),
> +					 NULL_RTX, 0, OPTAB_DIRECT);

Formatting nit: the last four lines should be indented by two spaces fewer.

> +  emit_jump_insn (PMODE_INSN (gen_tablejump, (operands[0], operands[1])));
> +  DONE;
> +})
> +(define_insn "sibcall_internal"
> +  [(call (mem:SI (match_operand 0 "call_insn_operand" "j,c,a,t,h"))
> +	 (match_operand 1 "" ""))]
> +  "SIBLING_CALL_P (insn)"
> +{
> +  switch (which_alternative)
> +    {
> +    case 0:
> +      return "jr\t%0";
> +    case 1:
> +      if (TARGET_CMODEL_LARGE)
> +	return "pcaddu18i\t$r12,(%%pcrel(%0+0x20000))>>18\n\t"
> +	       "jirl\t$r0,$r12,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
> +      else if (TARGET_CMODEL_EXTREME)
> +	return "la.local\t$r12,$r13,%0\n\tjr\t$r12";
> +      else
> +	return "b\t%0";
> +    case 2:
> +      if (TARGET_CMODEL_TINY_STATIC)
> +	return "b\t%0";
> +      else if (TARGET_CMODEL_EXTREME)
> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
> +      else
> +	return "la.global\t$r12,%0\n\tjr\t$r12";
> +    case 3:
> +      if (TARGET_CMODEL_EXTREME)
> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
> +      else
> +	return "la.global\t$r12,%0\n\tjr\t$r12";
> +    case 4:
> +      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
> +	return "b\t%%plt(%0)";
> +      else if (TARGET_CMODEL_LARGE)
> +	return "pcaddu18i\t$r12,(%%plt(%0)+0x20000)>>18\n\t"
> +	       "jirl\t$r0,$r12,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
> +      else
> +	{
> +	  sorry ("cmodel extreme and tiny static not support plt");

“do not support PLTs”.

Can this be triggered by a certain combination of source code
and command-line options, or is it really an internal error?

> +	  return "";  /* GCC complains about may fall through.  */

sorry() isn't a fatal error, so the compiler will continue.
Returning "" is the right thing to do, but the comment makes it
sound unreachable.

Same comments for later sorry() + return pairs.

> +	}
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
> +
> +(define_expand "sibcall_value"
> +  [(parallel [(set (match_operand 0 "")
> +		   (call (match_operand 1 "")
> +			 (match_operand 2 "")))
> +	      (use (match_operand 3 ""))])]		;; next_arg_reg
> +  ""
> +{
> +  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
> +
> + /*  Handle return values created by loongarch_pass_fpr_pair.  */

Formatting nit: should be two spaces before “/*” and only one after it.

> +  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
> +    {
> +      rtx arg1 = XEXP (XVECEXP (operands[0],0, 0), 0);
> +      rtx arg2 = XEXP (XVECEXP (operands[0],0, 1), 0);
> +
> +      emit_call_insn (gen_sibcall_value_multiple_internal (arg1, target,
> +							   operands[2],
> +							   arg2));
> +    }
> +   else
> +    {
> +      /*  Handle return values created by loongarch_return_fpr_single.  */

Should only be one space after “/*” here too.

> +      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
> +      operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);

The last line should be indented by two spaces more.

Same three comments for call_value.

> […]
> +;;
> +;;  ....................
> +;;
> +;;	MISC.
> +;;
> +;;  ....................
> +;;
> +
> +(define_insn "nop"
> +  [(const_int 0)]
> +  ""
> +  "nop"
> +  [(set_attr "type"	"nop")
> +   (set_attr "mode"	"none")])

Formatting nit: most of the rest of the file uses a space rather than a
tab after the attribute name.  IMO a space looks nicer.  Same for some
later patterns.

> +
> +;; __builtin_loongarch_movfcsr2gr: move the FCSR into operand 0.
> +(define_insn "loongarch_movfcsr2gr"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +    (unspec_volatile:SI [(match_operand 1 "const_uimm5_operand")]
> +    UNSPECV_MOVFCSR2GR))]

Last two lines look under-indented.

> +  "TARGET_HARD_FLOAT"
> +  "movfcsr2gr\t%0,$r%1")
> +
> […]
> +(define_insn "stack_tie<mode>"
> +  [(set (mem:BLK (scratch))
> +	(unspec:BLK [(match_operand:GPR 0 "register_operand" "r")
> +		     (match_operand:GPR 1 "register_operand" "r")]
> +		    UNSPEC_TIE))]
> +  ""
> +  ""
> +  [(set_attr "length" "0")]
> +)

Using [(set_attr "type" "ghost")] should be slightly better for
scheduling than setting the length directly.
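
Something like this ought to work (untested), given the "ghost"
reservation in the pipeline description:

  (define_insn "stack_tie<mode>"
    [(set (mem:BLK (scratch))
	  (unspec:BLK [(match_operand:GPR 0 "register_operand" "r")
		       (match_operand:GPR 1 "register_operand" "r")]
		      UNSPEC_TIE))]
    ""
    ""
    [(set_attr "type" "ghost")])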

> +
> +(define_insn "gpr_restore_return"
> +  [(return)
> +   (use (match_operand 0 "pmode_register_operand" ""))
> +   (const_int 0)]
> +  ""
> +  "")

Might be worth adding a comment here.  Why does this form of return expand
to no code?

> +\f
> +(define_split
> +  [(match_operand 0 "small_data_pattern")]
> +  "reload_completed"
> +  [(match_dup 0)]
> +  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
> +
> +
> +;; Match paired HI/SI/SF/DFmode load/stores.
> +(define_insn "*join2_load_store<JOIN_MODE:mode>"
> +  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
> +  "=r,f,m,m,r,ZC,r,k,f,k")
> +	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
> +   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
> +   "=r,f,m,m,r,ZC,r,k,f,k")
> +	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
> +  "reload_completed"
> +  {
> +    bool load_p = (which_alternative == 0 || which_alternative == 1);
> +    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
> +       Hardware does not bond those loads, even when they are consecutive.
> +       However, order of the loads need to be checked for correctness.  */
> +    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
> +      {

I'm not sure I understand how these patterns work, but it looks like the
condition above is trying to work around a later change to the insn by
regrename, after peephole2 has checked loongarch_load_store_bonding_p.
If so, you should be able to avoid that by marking the destinations of
the loads as earlyclobbers, using "&r" instead of "r" for the first
alternative.  regrename should then preserve the conditions that
loongarch_load_store_bonding_p checked earlier.

Same for the other patterns.
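
i.e. making the constraint strings for operands 0 and 2 something like:

  "=&r,f,m,m,r,ZC,r,k,f,k"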

> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
> +			 operands);
> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
> +			 &operands[2]);
> +      }
> +    else
> +      {
> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
> +			 &operands[2]);
> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
> +			 operands);
> +      }
> +    return "";
> +  }
> +  [(set_attr "move_type"
> +  "load,fpload,store,fpstore,load,store,load,store,fpload,fpstore")
> +   (set_attr "insn_count" "2,2,2,2,2,2,2,2,2,2")])
> […]
> +;; This is used for indexing into vectors, and hence only accepts const_int.
> +(define_predicate "const_0_or_1_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))

It doesn't look like this is used.  It'd be good to check the other
predicates to see if any of them can be removed.  In particular…

> […]
> +(define_predicate "const_8_to_15_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
> +
> +(define_predicate "const_16_to_31_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))

…these two don't seem to do what their name suggests.
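
Presumably something like this was intended, assuming the names
describe the ranges:

  (define_predicate "const_8_to_15_operand"
    (and (match_code "const_int")
         (match_test "IN_RANGE (INTVAL (op), 8, 15)")))

  (define_predicate "const_16_to_31_operand"
    (and (match_code "const_int")
         (match_test "IN_RANGE (INTVAL (op), 16, 31)")))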

> […]
> +(define_predicate "muldiv_target_operand"
> +		(match_operand 0 "register_operand"))

This one also seems unused.  IMO using register_operand would be clearer.

> […]
> +(define_predicate "is_const_call_local_symbol"
> +  (and (match_operand 0 "const_call_insn_operand")
> +       (ior (match_test "loongarch_global_symbol_p (op) == 0")
> +       (match_test "loongarch_symbol_binds_local_p (op) != 0"))

The indentation looks misleading here: the last line is another
operand of the ior.
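
i.e.:

  (define_predicate "is_const_call_local_symbol"
    (and (match_operand 0 "const_call_insn_operand")
         (ior (match_test "loongarch_global_symbol_p (op) == 0")
	      (match_test "loongarch_symbol_binds_local_p (op) != 0"))
         (match_test "CONSTANT_P (op)")))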

Thanks,
Richard

> +       (match_test "CONSTANT_P (op)")))
> […]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 04/12] LoongArch Port: Machine description files.
  2022-03-06 16:16   ` Richard Sandiford
@ 2022-03-07  3:59     ` 程璐璐
  2022-03-19  9:40     ` 程璐璐
  1 sibling, 0 replies; 34+ messages in thread
From: 程璐璐 @ 2022-03-07  3:59 UTC (permalink / raw)
  To: xuchenghua, gcc-patches, joseph, richard.sandiford

Hi, Richard,

   Thanks for your review.
   We will revise it as soon as possible and submit it in the next version.

On 2022/3/7 12:16 AM, Richard Sandiford wrote:

> Hi,
>
> Some comments below, but otherwise it looks good to me.
>
> xuchenghua@loongson.cn writes:
>> […]
>> +(define_memory_constraint "k"
>> +  "A memory operand whose address is formed by a base register and (optionally scaled)
>> +   index register."
>> +  (and (match_code "mem")
>> +       (not (match_test "loongarch_14bit_shifted_offset_address_p (XEXP (op, 0), mode)"))
>> +       (not (match_test "loongarch_12bit_offset_address_p (XEXP (op, 0), mode)"))))
> It's not really safe to test MEM addresses using only negative conditions.
> There needs to be a positive condition too, even if it's only:
>
>         (match_test "memory_address_addr_space_p (GET_MODE (op), XEXP (op, 0),
> 						 MEM_ADDR_SPACE (op))")))
>
> (from common.md).
>
>> […]
>> +(define_constraint "v"
>> +  "A nsigned 64-bit constant and low 44-bit is zero (for logic instructions)."
> typo
>
>> […]
>> +(define_memory_constraint "ZB"
>> +  "@internal
>> +  An address that is held in a general-purpose register.
>> +  The offset is zero"
>> +  (and (match_code "mem")
>> +       (match_test "GET_CODE (XEXP (op,0)) == REG")))
> It'd be good to use things like REG_P in new code.
>
> Formatting nit: should be a space after “op,”.
>
>> […]
>> +;; Pipeline descriptions.
>> +;;
>> +;; generic.md provides a fallback for processors without a specific
>> +;; pipeline description.  It is derived from the old define_function_unit
>> +;; version and uses the "alu" and "imuldiv" units declared below.
>> +;;
>> +;; Some of the processor-specific files are also derived from old
>> +;; define_function_unit descriptions and simply override the parts of
>> +;; generic.md that don't apply.  The other processor-specific files
>> +;; are self-contained.
> I don't think these last two paragraphs apply to the new code.
> The MIPS generic.md was converted from a much older pipeline
> description format.  The conversion meant the older processor
> descriptions weren't self-contained and relied on a mixture
> of processor-specific things (in their own file) and generic
> things (in this file).
>
> New processor descriptions should be self-contained as far
> as possible, in terms of not sharing cpu units with other
> processor descriptions.
>
>> +(define_automaton "alu,imuldiv")
>> +
>> +(define_cpu_unit "alu" "alu")
>> +(define_cpu_unit "imuldiv" "imuldiv")
>> +
>> +;; Ghost instructions produce no real code.
>> +;; They exist purely to express an effect on dataflow.
>> +(define_insn_reservation "ghost" 0
>> +  (eq_attr "type" "ghost")
>> +  "nothing")
>> +
>> +;; This file is derived from the old define_function_unit description.
>> +;; Each reservation can be overridden on a processor-by-processor basis.
> Same for this last comment.
>
> The "ghost" reservation is inherently sharable because it doesn't
> use any CPU units.  But for a new port, I think the other reservations
> in this file should be conditional on a particular -mtune.
>
>> […]
>> diff --git a/gcc/config/loongarch/la464.md b/gcc/config/loongarch/la464.md
>> new file mode 100644
>> index 00000000000..ae3808b51bb
>> --- /dev/null
>> +++ b/gcc/config/loongarch/la464.md
>> @@ -0,0 +1,132 @@
>> +;; Pipeline model for LoongArch LA464 cores.
>> +
>> +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
>> +;; Contributed by Loongson Ltd.
>> +
>> +;; This file is part of GCC.
>> +;;
>> +;; GCC is free software; you can redistribute it and/or modify it
>> +;; under the terms of the GNU General Public License as published
>> +;; by the Free Software Foundation; either version 3, or (at your
>> +;; option) any later version.
>> +;;
>> +;; GCC is distributed in the hope that it will be useful, but WITHOUT
>> +;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>> +;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +;; License for more details.
>> +;;
>> +;; You should have received a copy of the GNU General Public License
>> +;; along with GCC; see the file COPYING3.  If not see
>> +;; <http://www.gnu.org/licenses/>.
>> +
>> +;; Uncomment the following line to output automata for debugging.
>> +;; (automata_option "v")
>> +
>> +;; Automaton for integer instructions.
>> +(define_automaton "la464_a_alu")
>> +
>> +;; Automaton for floating-point instructions.
>> +(define_automaton "la464_a_falu")
>> +
>> +;; Automaton for memory operations.
>> +(define_automaton "la464_a_mem")
>> +
>> +;; Describe the resources.
>> +
>> +(define_cpu_unit "la464_alu1" "la464_a_alu")
>> +(define_cpu_unit "la464_alu2" "la464_a_alu")
>> +(define_cpu_unit "la464_mem1" "la464_a_mem")
>> +(define_cpu_unit "la464_mem2" "la464_a_mem")
>> +(define_cpu_unit "la464_falu1" "la464_a_falu")
>> +(define_cpu_unit "la464_falu2" "la464_a_falu")
>> +
>> +;; Describe instruction reservations.
>> +
>> +(define_insn_reservation "la464_arith" 1
>> +  (and (match_test "TARGET_ARCH_LA464")
> Normally scheduling should be determined by -mtune (with a default
> -mtune chosen by -march when no explicit -mtune is given).  So I was
> surprised to see this testing TARGET_ARCH_* instead of TARGET_TUNE_*.
>
>> +       (eq_attr "type" "arith,clz,const,logical,
>> +			move,nop,shift,signext,slt"))
>> +  "la464_alu1 | la464_alu2")
>> […]
>> +;; Main data type used by the insn
>> +(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,OI,SF,DF,TF,FCC"
>> +  (const_string "unknown"))
> Do any patterns use mode==OI?  Reason for asking is that:
>
>> […]
>> +;; True if the main data type is four times of the size of a word.
>> +(define_attr "qword_mode" "no,yes"
>> +  (cond [(and (eq_attr "mode" "TI,TF")
>> +	      (not (match_test "TARGET_64BIT")))
>> +	 (const_string "yes")]
>> +	(const_string "no")))
>> +
>> +;; True if the main data type is eight times of the size of a word.
>> +(define_attr "oword_mode" "no,yes"
>> +  (cond [(and (eq_attr "mode" "OI")
>> +	      (not (match_test "TARGET_64BIT")))
>> +	 (const_string "yes")]
>> +	(const_string "no")))
> …it seemed inconsistent to treat OI is 8 words on 32-bit targets
> but not as 4 words (qword_mode) on 64-bit targets.
>
> It looks like oword_mode and OI might not be used, in which case
> it's probably easiest to remove them for now.
>
>> +(define_attr "compression" "none,all"
>> +  (const_string "none"))
> Does anything use this, or is it a holdover from MIPS?  I could see some
> definitions of the attribute later in the file, but there didn't seem
> to be any users.
>
>> […]
>> +;; This mode iterator allows the QI HI SI and DI extension patterns to be
> incomplete sentence.
>
>> +(define_mode_iterator QHWD [QI HI SI (DI "TARGET_64BIT")])
>> +
>> +;; Iterator for hardware-supported floating-point modes.
>> +(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
>> +			    (DF "TARGET_DOUBLE_FLOAT")])
>> +
>> +;; A floating-point mode for which moves involving FPRs may need to be split.
> Probably s/floating-point //, given that the iterator includes DI.
>
>> +(define_mode_iterator SPLITF
>> +  [(DF "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
>> +   (DI "!TARGET_64BIT && TARGET_DOUBLE_FLOAT")
>> +   (TF "TARGET_64BIT && TARGET_DOUBLE_FLOAT")])
>> +
>> +;; In GPR templates, a string like "mul.<d>" will expand to "mul" in the
>> +;; 32-bit "mul.w" and "mul.d" in the 64-bit version.
> Maybe:
>
> ;; In GPR templates, a string like "mul.<d>" will expand to "mul.w" in the
> ;; 32-bit version and "mul.d" in the 64-bit version.
>
>> +(define_mode_attr d [(SI "w") (DI "d")])
>> +
>> +;; This attribute gives the length suffix for a load or store instruction.
>> +;; The same suffixes work for zero and sign extensions.
>> +(define_mode_attr size [(QI "b") (HI "h") (SI "w") (DI "d")])
>> +(define_mode_attr SIZE [(QI "B") (HI "H") (SI "W") (DI "D")])
>> +
>> +;; This attributes gives the mode mask of a SHORT.
> s/attributes/attribute/
>
>> +(define_mode_attr mask [(QI "0x00ff") (HI "0xffff")])
>> +
>> +;; This attributes gives the size (bits) of a SHORT.
> Maybe:
>
> ;; This attribute gives the number of bits in a SHORT minus one.
>
> since it's 7,15 rather than 8,16.
>
>> +(define_mode_attr qi_hi [(QI "7") (HI "15")])
> Just a suggestion, but it might be worth renaming this.  qi_hi only
> describes the range of inputs rather than what the attribute is.
>
>> […]
>> +(define_insn "*addsi3_extended"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
>> +	(sign_extend:DI
>> +	     (plus:SI (match_operand:SI 1 "register_operand" "r,r")
>> +		      (match_operand:SI 2 "arith_operand" "r,I"))))]
>> +  "TARGET_64BIT"
>> +  "add%i2.w\t%0,%1,%2"
>> +  [(set_attr "alu_type" "add")
>> +   (set_attr "mode" "SI")])
>> +
>> +(define_insn "*addsi3_extended2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r")
>> +	(sign_extend:DI
>> +	  (subreg:SI (plus:DI (match_operand:DI 1 "register_operand" "r,r")
>> +			      (match_operand:DI 2 "arith_operand"    "r,I"))
>> +		     0)))]
>> +  "TARGET_64BIT"
>> +  "add%i2.w\t%0,%1,%2"
>> +  [(set_attr "alu_type" "add")
>> +   (set_attr "mode" "SI")])
> This is really a question for part 5, but it affects this part too:
>
> I notice the port defines TRULY_NOOP_TRUNCATION to false for DI->SI,
> like MIPS does.  Are you sure that's the right trade-off?  It had to
> be defined that way for MIPS because 32-bit MIPS instructions gave
> undefined results if the input operands weren't in sign-extended form.
> For example a 32-bit add with 64-bit register contents:
>
>     0x(ffffffff_)fffffffe
>   + 0x(00000000_)00000001
>
> was guaranteed to give -1 (sign-extended) but:
>
>     0x(00000000_)fffffffe
>   + 0x(00000000_)00000001
>
> gave an undefined result.
>
> Does Loongson have the same restriction, or do the 32-bit instructions
> simply ignore the upper 32 bits?  If Loongson ignores the upper bits
> then it would probably be better not to define TRULY_NOOP_TRUNCATION.
>
> If you do that, the subreg pattern above shouldn't be needed; the
> subreg should get folded away by target-independent code.
>
>> +;;
>> +;;  ....................
>> +;;
>> +;;	MULTIPLICATION
>> +;;
>> +;;  ....................
>> +;;
>> +
>> +(define_insn "mul<mode>3"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(mult:ANYF (match_operand:ANYF 1 "register_operand" "f")
>> +		   (match_operand:ANYF 2 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
> Not a big deal, just noticed that some ANYF patterns (like this one)
> add an explicit TARGET_HARD_FLOAT on top of the ANYF conditions:
>
> (define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
> 			    (DF "TARGET_DOUBLE_FLOAT")])
>
> while other patterns don't.  Both are correct of course, but it might
> be good to use one style throughout the file.  At first, when I saw
> the pattern above, I thought the TARGET_HARD_FLOAT was missing from
> the earlier patterns.
>
>> […]
>> +(define_insn "*div<mode>3"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(div:ANYF (match_operand:ANYF 1 "register_operand" "f")
>> +		  (match_operand:ANYF 2 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "fdiv.<fmt>\t%0,%1,%2"
>> +  [(set_attr "type" "fdiv")
>> +   (set_attr "mode" "<UNITMODE>")
>> +   (set_attr "insn_count" "1")])
>> +
>> +;; In 3A5000, the reciprocal operation is the same as the division operation.
>> +
>> +(define_insn "*recip<mode>3"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
>> +		  (match_operand:ANYF 2 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "frecip.<fmt>\t%0,%2"
>> +  [(set_attr "type" "frdiv")
>> +   (set_attr "mode" "<UNITMODE>")
>> +   (set_attr "insn_count" "1")])
> Very minor, but these insn_counts seem redundant.  (MIPS needed them
> due to the SB1 errata workaround.)
>
>> […]
>> +;; Floating point multiply accumulate instructions.
>> +
>> +;; a * b + c
>> +(define_expand "fma<mode>4"
>> +  [(set (match_operand:ANYF 0 "register_operand")
>> +	(fma:ANYF (match_operand:ANYF 1 "register_operand")
>> +		  (match_operand:ANYF 2 "register_operand")
>> +		  (match_operand:ANYF 3 "register_operand")))]
>> +  "TARGET_HARD_FLOAT")
>> +
>> +(define_insn "*fma<mode>4_madd4"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(fma:ANYF (match_operand:ANYF 1 "register_operand" "f")
>> +		  (match_operand:ANYF 2 "register_operand" "f")
>> +		  (match_operand:ANYF 3 "register_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "fmadd.<fmt>\t%0,%1,%2,%3"
>> +  [(set_attr "type" "fmadd")
>> +   (set_attr "mode" "<UNITMODE>")])
> This pair of patterns could be a single define_insn, like for fms.
>
>> […]
>> +;; Integer truncation patterns.  Truncating SImode values to smaller
>> +;; modes is a no-op, as it is for most other GCC ports.  Truncating
>> +;; DImode values to SImode is not a no-op for TARGET_64BIT since we
>> +;; need to make sure that the lower 32 bits are properly sign-extended
>> +;; (see TARGET_TRULY_NOOP_TRUNCATION).  Truncating DImode values into modes
>> +;; smaller than SImode is equivalent to two separate truncations:
>> +;;
>> +;;			  A       B
>> +;;    DI ---> HI  ==  DI ---> SI ---> HI
>> +;;    DI ---> QI  ==  DI ---> SI ---> QI
>> +;;
>> +;; Step A needs a real instruction but step B does not.
>> +
>> +(define_insn "truncdi<mode>2"
>> +  [(set (match_operand:SUBDI 0 "nonimmediate_operand" "=r,m,k")
>> +	(truncate:SUBDI (match_operand:DI 1 "register_operand" "r,r,r")))]
>> +  "TARGET_64BIT"
>> +  "@
>> +    slli.w\t%0,%1,0
>> +    st.<size>\t%1,%0
>> +    stx.<size>\t%1,%0"
>> +  [(set_attr "move_type" "sll0,store,store")
>> +   (set_attr "mode" "SI")])
> Following on from the above, this pattern wouldn't be needed
> if TRULY_NOOP_TRUNCATION can be left undefined.
>
>> […]
>> +;; Combiner patterns to optimize shift/truncate combinations.
>> +
>> +(define_insn "*ashr_trunc<mode>"
>> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
>> +	(truncate:SUBDI
>> +	  (ashiftrt:DI (match_operand:DI 1 "register_operand" "r")
>> +		       (match_operand:DI 2 "const_arith_operand" ""))))]
>> +  "TARGET_64BIT && IN_RANGE (INTVAL (operands[2]), 32, 63)"
>> +  "srai.d\t%0,%1,%2"
>> +  [(set_attr "type" "shift")
>> +   (set_attr "mode" "<MODE>")])
>> +
>> +(define_insn "*lshr32_trunc<mode>"
>> +  [(set (match_operand:SUBDI 0 "register_operand" "=r")
>> +	(truncate:SUBDI
>> +	  (lshiftrt:DI (match_operand:DI 1 "register_operand" "r")
>> +		       (const_int 32))))]
>> +  "TARGET_64BIT"
>> +  "srai.d\t%0,%1,32"
>> +  [(set_attr "type" "shift")
>> +   (set_attr "mode" "<MODE>")])
> Same for these.
>
>> +\f
>> +;;
>> +;;  ....................
>> +;;
>> +;;	ZERO EXTENSION
>> +;;
>> +;;  ....................
>> +
>> +(define_insn "zero_extendsidi2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
>> +	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,ZC,m,k")))]
>> +  "TARGET_64BIT"
>> +  "@
>> +   bstrpick.d\t%0,%1,31,0
>> +   ldptr.w\t%0,%1\n\tlu32i.d\t%0,0
> FWIW, splitting this after reload would give more scheduling freedom,
> but it's not necessary to do that for the submission.
>
>> +   ld.wu\t%0,%1
>> +   ldx.wu\t%0,%1"
>> +  [(set_attr "move_type" "arith,load,load,load")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "1,2,1,1")])
>> +
>> […]
>> +;; Combiner patterns to optimize truncate/zero_extend combinations.
>> +
>> +(define_insn "*zero_extend<GPR:mode>_trunc<SHORT:mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=r")
>> +	(zero_extend:GPR
>> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "bstrpick.w\t%0,%1,<SHORT:qi_hi>,0"
>> +  [(set_attr "move_type" "pick_ins")
>> +   (set_attr "mode" "<GPR:MODE>")])
>> +
>> +(define_insn "*zero_extendhi_truncqi"
>> +  [(set (match_operand:HI 0 "register_operand" "=r")
>> +	(zero_extend:HI
>> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "andi\t%0,%1,0xff"
>> +  [(set_attr "alu_type" "and")
>> +   (set_attr "mode" "HI")])
>> +\f
>> +;;
>> +;;  ....................
>> +;;
>> +;;	SIGN EXTENSION
>> +;;
>> +;;  ....................
>> +
>> +;; Extension insns.
>> +;; Those for integer source operand are ordered widest source type first.
>> +
>> +;; When TARGET_64BIT, all SImode integer should already be in sign-extended
>> +;; form (see TARGET_TRULY_NOOP_TRUNCATION and truncdisi2).  We can therefore
>> +;; get rid of register->register instructions if we constrain the source to
>> +;; be in the same register as the destination.
>> +;;
>> +;; Only the pre-reload scheduler sees the type of the register alternatives;
>> +;; we split them into nothing before the post-reload scheduler runs.
>> +;; These alternatives therefore have type "move" in order to reflect
>> +;; what happens if the two pre-reload operands cannot be tied, and are
>> +;; instead allocated two separate GPRs.
>> +(define_insn_and_split "extendsidi2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
>> +	(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,ZC,m,k")))]
>> +  "TARGET_64BIT"
>> +  "@
>> +   #
>> +   ldptr.w\t%0,%1
>> +   ld.w\t%0,%1
>> +   ldx.w\t%0,%1"
>> +  "&& reload_completed && register_operand (operands[1], VOIDmode)"
>> +  [(const_int 0)]
>> +{
>> +  emit_note (NOTE_INSN_DELETED);
>> +  DONE;
>> +}
>> +  [(set_attr "move_type" "move,load,load,load")
>> +   (set_attr "mode" "DI")])
> The three patterns above would also look different without
> the definition of TRULY_NOOP_TRUNCATION.
>
>> […]
>> +(define_insn "*extenddi_truncate<mode>"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(sign_extend:DI
>> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "ext.w.<size>\t%0,%1"
>> +  [(set_attr "move_type" "signext")
>> +   (set_attr "mode" "DI")])
>> +
>> +(define_insn "*extendsi_truncate<mode>"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +	(sign_extend:SI
>> +	    (truncate:SHORT (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "ext.w.<size>\t%0,%1"
>> +  [(set_attr "move_type" "signext")
>> +   (set_attr "mode" "SI")])
>> +
>> +(define_insn "*extendhi_truncateqi"
>> +  [(set (match_operand:HI 0 "register_operand" "=r")
>> +	(sign_extend:HI
>> +	    (truncate:QI (match_operand:DI 1 "register_operand" "r"))))]
>> +  "TARGET_64BIT"
>> +  "ext.w.b\t%0,%1"
>> +  [(set_attr "move_type" "signext")
>> +   (set_attr "mode" "SI")])
> Same for these.
>
>> +
>> +(define_insn "extendsfdf2"
>> +  [(set (match_operand:DF 0 "register_operand" "=f")
>> +	(float_extend:DF (match_operand:SF 1 "register_operand" "f")))]
>> +  "TARGET_DOUBLE_FLOAT"
>> +  "fcvt.d.s\t%0,%1"
>> +  [(set_attr "type" "fcvt")
>> +   (set_attr "cnv_mode"	"S2D")
>> +   (set_attr "mode" "DF")])
> Very minor, but it's probably better to use a space rather than a tab
> after "cnv_mode", for consistency with the other attributes.  Same in
> the rest of the file.
>
>> +;; floating point value by converting to value to an unsigned integer
> typo, maybe:
>
> ;; Convert a floating-point value to an unsigned integer.
>
>> +
>> +(define_expand "fixuns_truncdfsi2"
>> +  [(set (match_operand:SI 0 "register_operand")
>> +	(unsigned_fix:SI (match_operand:DF 1 "register_operand")))]
>> +  "TARGET_DOUBLE_FLOAT"
>> +{
>> +  rtx reg1 = gen_reg_rtx (DFmode);
>> +  rtx reg2 = gen_reg_rtx (DFmode);
>> +  rtx reg3 = gen_reg_rtx (SImode);
>> +  rtx_code_label *label1 = gen_label_rtx ();
>> +  rtx_code_label *label2 = gen_label_rtx ();
>> +  rtx test;
>> +  REAL_VALUE_TYPE offset;
>> +
>> +  real_2expN (&offset, 31, DFmode);
>> +
>> +  if (reg1)		      /* Turn off complaints about unreached code.  */
>> +    {
> I think we can drop this workaround.  IIRC it was needed for the
> MIPS IRIX compilers, or maybe it was some ancient version of GCC.
>
>> +      loongarch_emit_move (reg1,
>> +			   const_double_from_real_value (offset, DFmode));
>> +      do_pending_stack_adjust ();
>> +
>> +      test = gen_rtx_GE (VOIDmode, operands[1], reg1);
>> +      emit_jump_insn (gen_cbranchdf4 (test, operands[1], reg1, label1));
>> +
>> +      emit_insn (gen_fix_truncdfsi2 (operands[0], operands[1]));
>> +      emit_jump_insn (gen_rtx_SET (pc_rtx,
>> +				   gen_rtx_LABEL_REF (VOIDmode, label2)));
>> +      emit_barrier ();
>> +
>> +      emit_label (label1);
>> +      loongarch_emit_move (reg2, gen_rtx_MINUS (DFmode, operands[1], reg1));
>> +      loongarch_emit_move (reg3, GEN_INT (trunc_int_for_mode
>> +				     (BITMASK_HIGH, SImode)));
>> +
>> +      emit_insn (gen_fix_truncdfsi2 (operands[0], reg2));
>> +      emit_insn (gen_iorsi3 (operands[0], operands[0], reg3));
>> +
>> +      emit_label (label2);
>> +
>> +      /* Allow REG_NOTES to be set on last insn (labels don't have enough
>> +	 fields, and can't be used for REG_NOTES anyway).  */
>> +      emit_use (stack_pointer_rtx);
>> +      DONE;
>> +    }
>> +})
>> […]
>> +;; Allow combine to split complex const_int load sequences, using operand 2
>> +;; to store the intermediate results.  See move_operand for details.
>> +(define_split
>> +  [(set (match_operand:GPR 0 "register_operand")
>> +	(match_operand:GPR 1 "splittable_const_int_operand"))
>> +   (clobber (match_operand:GPR 2 "register_operand"))]
>> +  ""
>> +  [(const_int 0)]
>> +{
>> +  loongarch_move_integer (operands[2], operands[0], INTVAL (operands[1]));
>> +  DONE;
>> +})
> Does this define_split trigger?  I couldn't see an associated
> define_insn that would provide the clobber.
>
>> +(define_insn "*movsi_internal"
>> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,w,*f,*f,*r,*m,*r,*z")
>> +	(match_operand:SI 1 "move_operand" "r,Yd,w,rJ,*r*J,*m,*f,*f,*z,*r"))]
>> +  "(register_operand (operands[0], SImode)
>> +       || reg_or_0_operand (operands[1], SImode))"
> Minor formatting nit: too much indentation on the line above.  Same for
> the other moves.
>
>> […]
>> +;; Conditional move instructions.
>> +
>> +(define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
>> +  [(set (match_operand:GPR 0 "register_operand" "=r,r")
>> +	(if_then_else:GPR
>> +	 (equality_op:GPR2 (match_operand:GPR2 1 "register_operand" "r,r")
>> +			   (const_int 0))
>> +	 (match_operand:GPR 2 "reg_or_0_operand" "r,J")
>> +	 (match_operand:GPR 3 "reg_or_0_operand" "J,r")))]
>> +  "register_operand (operands[2], <GPR:MODE>mode)
>> +       != register_operand (operands[3], <GPR:MODE>mode)"
> Same here.
>
>> +  "@
>> +   <sel>\t%0,%2,%1
>> +   <selinv>\t%0,%3,%1"
>> +  [(set_attr "type" "condmove")
>> +   (set_attr "mode" "<GPR:MODE>")])
>> +
>> +;; sel.fmt copies the 3rd argument when the 1st is non-zero and the 2nd
> s/sel.fmt/fsel/
>
>> +;; argument if the 1st is zero.  This means operand 2 and 3 are
>> +;; inverted in the instruction.
>> +
>> +(define_insn "*sel<mode>"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(if_then_else:ANYF
>> +	 (ne:FCC (match_operand:FCC 1 "register_operand" "z")
>> +		 (const_int 0))
>> +	 (match_operand:ANYF 2 "reg_or_0_operand" "f")
>> +	 (match_operand:ANYF 3 "reg_or_0_operand" "f")))]
>> +  "TARGET_HARD_FLOAT"
>> +  "fsel\t%0,%3,%2,%1"
>> +  [(set_attr "type" "condmove")
>> +   (set_attr "mode" "<ANYF:MODE>")])
>> […]
>> +(define_insn "lu52i_d"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(ior:DI
>> +	  (and:DI (match_operand:DI 1 "register_operand" "r")
>> +		  (match_operand 2 "lu52i_mask_operand"))
>> +	  (match_operand 3 "const_lu52i_operand" "v")))]
>> +    "TARGET_64BIT"
>> +    "lu52i.d\t%0,%1,%X3>>52"
>> +    [(set_attr "type" "arith")
>> +     (set_attr "mode" "DI")])
> Formatting nit: too much indentation from "TARGET_64BIT" onwards.
>
>> +
>> +;; Convert floating-point numbers to integers
>> +(define_insn "frint_<fmt>"
>> +  [(set (match_operand:ANYF 0 "register_operand" "=f")
>> +	(unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
>> +		      UNSPEC_FRINT))]
>> +  "TARGET_HARD_FLOAT"
>> +  "frint.<fmt>\t%0,%1"
>> +  [(set_attr "type" "fcvt")
>> +   (set_attr "mode" "<MODE>")])
>> +
>> +;; LoongArch supports loading and storing a floating point register from
>> +;; the sum of two general-purpose registers.  We use two versions for each of
>> +;; these four instructions: one where the two general-purpose registers are
>> +;; SImode, and one where they are DImode.  This is because general-purpose
>> +;; registers will be in SImode when they hold 32-bit values, but,
>> +;; since the 32-bit values are always sign extended, the f{ld/st}x.{s/d}
>> +;; instructions will still work correctly.
>> +
>> +;; ??? Perhaps it would be better to support these instructions by
>> +;; modifying TARGET_LEGITIMATE_ADDRESS_P and friends.  However, since
>> +;; these instructions can only be used to load and store floating
>> +;; point registers, that would probably cause trouble in reload.
> Does this comment apply to Loongson?  It looks from:
>
>    +;; Similarly for LoongArch indexed GPR loads and stores.
>    +(define_mode_attr loadx [(QI "ldx.b")
>    +			 (HI "ldx.h")
>    +			 (SI "ldx.w")
>    +			 (DI "ldx.d")])
>    +(define_mode_attr storex [(QI "stx.b")
>    +			  (HI "stx.h")
>    +			  (SI "stx.w")
>    +			  (DI "stx.d")])
>
> like Loongson has a full set of indexed loads and stores, so supporting
> indexed addresses in TARGET_LEGITIMATE_ADDRESS_P should work.
>
> Also, the MIPS comment predates LRA, which is better than old reload
> at handling irregular memory address requirements.  Accepting indexed
> addresses in TARGET_LEGITIMATE_ADDRESS_P would allow ivopts to optimise
> the code better.
>
>> […]
>> +;; Expand in-line code to clear the instruction cache between operand[0] and
>> +;; operand[1].
>> +(define_expand "clear_cache"
>> +  [(match_operand 0 "pmode_register_operand")
>> +   (match_operand 1 "pmode_register_operand")]
>> +  ""
>> +  "
>> +{
>> +  emit_insn (gen_ibar (const0_rtx));
>> +  DONE;
>> +}")
> Minor nit: the quotes before { and after } are unnecessary.
>
>> […]
>> +(define_insn "asrtle_d"
>> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
>> +			      (match_operand:DI 1 "register_operand" "r")]
>> +			      UNSPECV_ASRTLE_D)]
>> +  "TARGET_64BIT"
>> +  "asrtle.d\t%0,%1"
>> +  [(set_attr "type" "load")
>> +   (set_attr "mode" "DI")])
>> +
>> +(define_insn "asrtgt_d"
>> +	[(unspec_volatile:DI [(match_operand:DI 0 "register_operand" "r")
>> +			      (match_operand:DI 1 "register_operand" "r")]
>> +			      UNSPECV_ASRTGT_D)]
>> +  "TARGET_64BIT"
>> +  "asrtgt.d\t%0,%1"
>> +  [(set_attr "type" "load")
>> +   (set_attr "mode" "DI")])
> Formatting: the unspec_volatile pattern is indented too far in both cases.
>
>> […]
>> +;; The following templates were added to generate "bstrpick.d + alsl.d"
>> +;; instruction pairs.
>> +;; It is required that the values of const_immalsl_operand and
>> +;; immediate_operand must have the following correspondence:
>> +;;
>> +;; (immediate_operand >> const_immalsl_operand) == 0xffffffff
>> +
>> +(define_insn "zero_extend_ashift1"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
>> +			   (match_operand 2 "const_immalsl_operand" ""))
>> +		(match_operand 3 "immediate_operand" "")))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
>> +  [(set_attr "type" "arith")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "2")])
> Without the TRULY_NOOP_TRUNCATION definition, this pattern would be
> redundant with…
>
>> +(define_insn "zero_extend_ashift2"
>> +  [(set (match_operand:DI 0 "register_operand" "=r")
>> +	(and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
>> +			   (match_operand 2 "const_immalsl_operand" ""))
>> +		(match_operand 3 "immediate_operand" "")))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,$r0,%2"
>> +  [(set_attr "type" "arith")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "2")])
> …this one.  I'm surprised it isn't already TBH.  Doesn't the subreg match:
>
>    (match_operand:DI 1 "register_operand" "r")
>
> ?
>
>> +
>> +(define_insn "alsl_paired1"
>> +  [(set (match_operand:DI 0 "register_operand" "=&r")
>> +	(plus:DI (and:DI (ashift:DI (subreg:DI (match_operand:SI 1 "register_operand" "r") 0)
>> +				    (match_operand 2 "const_immalsl_operand" ""))
>> +			 (match_operand 3 "immediate_operand" ""))
>> +		 (match_operand:DI 4 "register_operand" "r")))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
>> +  [(set_attr "type" "arith")
>> +  (set_attr "mode" "DI")
>> +  (set_attr "insn_count" "2")])
>> +
>> +(define_insn "alsl_paired2"
>> +  [(set (match_operand:DI 0 "register_operand" "=&r")
>> +	(plus:DI (match_operand:DI 1 "register_operand" "r")
>> +		 (and:DI (ashift:DI (match_operand:DI 2 "register_operand" "r")
>> +				    (match_operand 3 "const_immalsl_operand" ""))
>> +			 (match_operand 4 "immediate_operand" ""))))]
>> +  "TARGET_64BIT
>> +   && ((INTVAL (operands[4]) >> INTVAL (operands[3])) == 0xffffffff)"
>> +  "bstrpick.d\t%0,%2,31,0\n\talsl.d\t%0,%0,%1,%3"
>> +  [(set_attr "type" "arith")
>> +   (set_attr "mode" "DI")
>> +   (set_attr "insn_count" "2")])
> Same for this pair.
>
>> […]
>> +(define_expand "tablejump"
>> +  [(set (pc)
>> +	(match_operand 0 "register_operand"))
>> +   (use (label_ref (match_operand 1 "")))]
>> +  ""
>> +{
>> +  if (flag_pic)
>> +      operands[0] = expand_simple_binop (Pmode, PLUS, operands[0],
>> +					 gen_rtx_LABEL_REF (Pmode,
>> +							    operands[1]),
>> +					 NULL_RTX, 0, OPTAB_DIRECT);
> Formatting nit: the last four lines should be indented by two spaces fewer.
>
>> +  emit_jump_insn (PMODE_INSN (gen_tablejump, (operands[0], operands[1])));
>> +  DONE;
>> +})
>> +(define_insn "sibcall_internal"
>> +  [(call (mem:SI (match_operand 0 "call_insn_operand" "j,c,a,t,h"))
>> +	 (match_operand 1 "" ""))]
>> +  "SIBLING_CALL_P (insn)"
>> +{
>> +  switch (which_alternative)
>> +    {
>> +    case 0:
>> +      return "jr\t%0";
>> +    case 1:
>> +      if (TARGET_CMODEL_LARGE)
>> +	return "pcaddu18i\t$r12,(%%pcrel(%0+0x20000))>>18\n\t"
>> +	       "jirl\t$r0,$r12,%%pcrel(%0+4)-(%%pcrel(%0+4+0x20000)>>18<<18)";
>> +      else if (TARGET_CMODEL_EXTREME)
>> +	return "la.local\t$r12,$r13,%0\n\tjr\t$r12";
>> +      else
>> +	return "b\t%0";
>> +    case 2:
>> +      if (TARGET_CMODEL_TINY_STATIC)
>> +	return "b\t%0";
>> +      else if (TARGET_CMODEL_EXTREME)
>> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
>> +      else
>> +	return "la.global\t$r12,%0\n\tjr\t$r12";
>> +    case 3:
>> +      if (TARGET_CMODEL_EXTREME)
>> +	return "la.global\t$r12,$r13,%0\n\tjr\t$r12";
>> +      else
>> +	return "la.global\t$r12,%0\n\tjr\t$r12";
>> +    case 4:
>> +      if (TARGET_CMODEL_NORMAL || TARGET_CMODEL_TINY)
>> +	return "b\t%%plt(%0)";
>> +      else if (TARGET_CMODEL_LARGE)
>> +	return "pcaddu18i\t$r12,(%%plt(%0)+0x20000)>>18\n\t"
>> +	       "jirl\t$r0,$r12,%%plt(%0)+4-((%%plt(%0)+(4+0x20000))>>18<<18)";
>> +      else
>> +	{
>> +	  sorry ("cmodel extreme and tiny static not support plt");
> “do not support PLTs”.
>
> Can this be triggered by a certain combination of source code
> and command-line options, or is it really an internal error?
>
>> +	  return "";  /* GCC complains about may fall through.  */
> sorry() isn't a fatal error, so the compiler will continue.
> Returning "" is the right thing to do, but the comment makes it
> sound unreachable.
>
> Same comments for later sorry() + return pairs.
>
>> +	}
>> +    default:
>> +      gcc_unreachable ();
>> +    }
>> +}
>> +  [(set_attr "jirl" "indirect,direct,direct,direct,direct")])
>> +
>> +(define_expand "sibcall_value"
>> +  [(parallel [(set (match_operand 0 "")
>> +		   (call (match_operand 1 "")
>> +			 (match_operand 2 "")))
>> +	      (use (match_operand 3 ""))])]		;; next_arg_reg
>> +  ""
>> +{
>> +  rtx target = loongarch_legitimize_call_address (XEXP (operands[1], 0));
>> +
>> + /*  Handle return values created by loongarch_pass_fpr_pair.  */
> Formatting nit: should be two spaces before “/*” and only one after it.
>
>> +  if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 2)
>> +    {
>> +      rtx arg1 = XEXP (XVECEXP (operands[0],0, 0), 0);
>> +      rtx arg2 = XEXP (XVECEXP (operands[0],0, 1), 0);
>> +
>> +      emit_call_insn (gen_sibcall_value_multiple_internal (arg1, target,
>> +							   operands[2],
>> +							   arg2));
>> +    }
>> +   else
>> +    {
>> +      /*  Handle return values created by loongarch_return_fpr_single.  */
> Should only be one space after “/*” here too.
>
>> +      if (GET_CODE (operands[0]) == PARALLEL && XVECLEN (operands[0], 0) == 1)
>> +      operands[0] = XEXP (XVECEXP (operands[0], 0, 0), 0);
> The last line should be indented by two spaces more.
>
> Same three comments for call_value.
>
>> […]
>> +;;
>> +;;  ....................
>> +;;
>> +;;	MISC.
>> +;;
>> +;;  ....................
>> +;;
>> +
>> +(define_insn "nop"
>> +  [(const_int 0)]
>> +  ""
>> +  "nop"
>> +  [(set_attr "type"	"nop")
>> +   (set_attr "mode"	"none")])
> Formatting nit: most of the rest of the file uses a space rather than a
> tab after the attribute name.  IMO a space looks nicer.  Same for some
> later patterns.
>
>> +
>> +;; __builtin_loongarch_movfcsr2gr: move the FCSR into operand 0.
>> +(define_insn "loongarch_movfcsr2gr"
>> +  [(set (match_operand:SI 0 "register_operand" "=r")
>> +    (unspec_volatile:SI [(match_operand 1 "const_uimm5_operand")]
>> +    UNSPECV_MOVFCSR2GR))]
> Last two lines look under-indented.
>
>> +  "TARGET_HARD_FLOAT"
>> +  "movfcsr2gr\t%0,$r%1")
>> +
>> […]
>> +(define_insn "stack_tie<mode>"
>> +  [(set (mem:BLK (scratch))
>> +	(unspec:BLK [(match_operand:GPR 0 "register_operand" "r")
>> +		     (match_operand:GPR 1 "register_operand" "r")]
>> +		    UNSPEC_TIE))]
>> +  ""
>> +  ""
>> +  [(set_attr "length" "0")]
>> +)
> Using [(set_attr "type" "ghost")] should be slightly better for
> scheduling than setting the length directly.
>
>> +
>> +(define_insn "gpr_restore_return"
>> +  [(return)
>> +   (use (match_operand 0 "pmode_register_operand" ""))
>> +   (const_int 0)]
>> +  ""
>> +  "")
> Might be worth adding a comment here.  Why does this form of return expand
> to no code?
>
>> +\f
>> +(define_split
>> +  [(match_operand 0 "small_data_pattern")]
>> +  "reload_completed"
>> +  [(match_dup 0)]
>> +  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
>> +
>> +
>> +;; Match paired HI/SI/SF/DFmode load/stores.
>> +(define_insn "*join2_load_store<JOIN_MODE:mode>"
>> +  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
>> +  "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
>> +   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
>> +   "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
>> +  "reload_completed"
>> +  {
>> +    bool load_p = (which_alternative == 0 || which_alternative == 1);
>> +    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
>> +       Hardware does not bond those loads, even when they are consecutive.
>> +       However, order of the loads need to be checked for correctness.  */
>> +    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
>> +      {
> I'm not sure I understand how these patterns work, but it looks like the
> condition above is trying to work around a later change to the insn by
> regrename, after peephole2 has checked loongarch_load_store_bonding_p.
> If so, you should be able to avoid that by marking the destinations of
> the loads as earlyclobbers, using "&r" instead of "r" for the first
> alternative.  regrename should then preserve the conditions that
> loongarch_load_store_bonding_p checked earlier.
>
> Same for the other patterns.
>
>> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
>> +			 operands);
>> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
>> +			 &operands[2]);
>> +      }
>> +    else
>> +      {
>> +	output_asm_insn (loongarch_output_move (operands[2], operands[3]),
>> +			 &operands[2]);
>> +	output_asm_insn (loongarch_output_move (operands[0], operands[1]),
>> +			 operands);
>> +      }
>> +    return "";
>> +  }
>> +  [(set_attr "move_type"
>> +  "load,fpload,store,fpstore,load,store,load,store,fpload,fpstore")
>> +   (set_attr "insn_count" "2,2,2,2,2,2,2,2,2,2")])
>> […]
>> +;; This is used for indexing into vectors, and hence only accepts const_int.
>> +(define_predicate "const_0_or_1_operand"
>> +  (and (match_code "const_int")
>> +       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
> It doesn't look like this is used.  It'd be good to check the other
> predicates to see if any of them can be removed.  In particular…
>
>> […]
>> +(define_predicate "const_8_to_15_operand"
>> +  (and (match_code "const_int")
>> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
>> +
>> +(define_predicate "const_16_to_31_operand"
>> +  (and (match_code "const_int")
>> +       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
> …these two don't seem to do what their name suggests.
>
>> […]
>> +(define_predicate "muldiv_target_operand"
>> +		(match_operand 0 "register_operand"))
> This one also seems unused.  IMO using register_operand would be clearer.
>
>> […]
>> +(define_predicate "is_const_call_local_symbol"
>> +  (and (match_operand 0 "const_call_insn_operand")
>> +       (ior (match_test "loongarch_global_symbol_p (op) == 0")
>> +       (match_test "loongarch_symbol_binds_local_p (op) != 0"))
> The indentation looks misleading here: the last line is another
> operand of the ior.
>
> Thanks,
> Richard
>
>> +       (match_test "CONSTANT_P (op)")))
>> […]


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 05/12] LoongArch Port: Machine description C files and .h files.
  2022-03-04  7:18 ` [PATCH v8 05/12] LoongArch Port: Machine description C files and .h files xuchenghua
@ 2022-03-07 18:17   ` Richard Sandiford
  2022-03-19  9:35     ` 程璐璐
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Sandiford @ 2022-03-07 18:17 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph

Hi,

Some comments below, but otherwise it looks good to me.

A few of the comments are about removing hook or macro definitions
that are the same as the default.  Doing that helps people who want
to update a hook interface in future, since there are then fewer
places to adjust.

xuchenghua@loongson.cn writes:
> […]
> +/* Describes how a symbol is used.
> +
> +   SYMBOL_CONTEXT_CALL
> +       The symbol is used as the target of a call instruction.
> +
> +   SYMBOL_CONTEXT_LEA
> +       The symbol is used in a load-address operation.
> +
> +   SYMBOL_CONTEXT_MEM
> +       The symbol is used as the address in a MEM.  */
> +enum loongarch_symbol_context {
> +  SYMBOL_CONTEXT_CALL,
> +  SYMBOL_CONTEXT_LEA,
> +  SYMBOL_CONTEXT_MEM
> +};

It looks like this is unused: loongarch_classify_symbol takes an
argument of this type, but ignores it.

> […]
> +/* Classifies a type of call.
> +
> +   LARCH_CALL_NORMAL
> +	A normal call or call_value pattern.
> +
> +   LARCH_CALL_SIBCALL
> +	A sibcall or sibcall_value pattern.
> +
> +   LARCH_CALL_EPILOGUE
> +	A call inserted in the epilogue.  */
> +enum loongarch_call_type {
> +  LARCH_CALL_NORMAL,
> +  LARCH_CALL_SIBCALL,
> +  LARCH_CALL_EPILOGUE
> +};

This also looks unused.

> +
> +/* Controls the conditions under which certain instructions are split.
> +
> +   SPLIT_IF_NECESSARY
> +	Only perform splits that are necessary for correctness
> +	(because no unsplit version exists).
> +
> +   SPLIT_FOR_SPEED
> +	Perform splits that are necessary for correctness or
> +	beneficial for code speed.
> +
> +   SPLIT_FOR_SIZE
> +	Perform splits that are necessary for correctness or
> +	beneficial for code size.  */
> +enum loongarch_split_type {
> +  SPLIT_IF_NECESSARY,
> +  SPLIT_FOR_SPEED,
> +  SPLIT_FOR_SIZE
> +};

It looks like this is also unused: loongarch_split_move_p takes an
argument of this type, but ignores it.

I think it'd be better to remove these three enums for now and add them
back later if they become useful.

> […]
> +/* RTX costs of various operations on the different architectures.  */
> +struct loongarch_rtx_cost_data
> +{
> +  unsigned short fp_add;
> +  unsigned short fp_mult_sf;
> +  unsigned short fp_mult_df;
> +  unsigned short fp_div_sf;
> +  unsigned short fp_div_df;
> +  unsigned short int_mult_si;
> +  unsigned short int_mult_di;
> +  unsigned short int_div_si;
> +  unsigned short int_div_di;
> +  unsigned short branch_cost;
> +  unsigned short memory_latency;
> +};
> +
> +/* Default RTX cost initializer.  */
> +#define COSTS_N_INSNS(N) ((N) * 4)
> +#define DEFAULT_COSTS				\
> +    .fp_add		= COSTS_N_INSNS (1),	\
> +    .fp_mult_sf		= COSTS_N_INSNS (2),	\
> +    .fp_mult_df		= COSTS_N_INSNS (2),	\
> +    .fp_div_sf		= COSTS_N_INSNS (4),	\
> +    .fp_div_df		= COSTS_N_INSNS (4),	\
> +    .int_mult_si	= COSTS_N_INSNS (1),	\
> +    .int_mult_di	= COSTS_N_INSNS (1),	\
> +    .int_div_si		= COSTS_N_INSNS (1),	\
> +    .int_div_di		= COSTS_N_INSNS (1),	\
> +    .branch_cost	= 2,			\
> +    .memory_latency	= 4

Unfortunately we need to stay within standard C++(11), so the initialisers
can't use “.name = value” notation.
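
One option is plain positional initialisation with comments, e.g.:

  #define DEFAULT_COSTS			\
    COSTS_N_INSNS (1),  /* fp_add */		\
    COSTS_N_INSNS (2),  /* fp_mult_sf */	\
    COSTS_N_INSNS (2),  /* fp_mult_df */	\
    COSTS_N_INSNS (4),  /* fp_div_sf */		\
    COSTS_N_INSNS (4),  /* fp_div_df */		\
    COSTS_N_INSNS (1),  /* int_mult_si */	\
    COSTS_N_INSNS (1),  /* int_mult_di */	\
    COSTS_N_INSNS (1),  /* int_div_si */	\
    COSTS_N_INSNS (1),  /* int_div_di */	\
    2,			/* branch_cost */	\
    4			/* memory_latency */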

> […]
> +/* Classifies an address.
> +
> +   ADDRESS_REG
> +       A natural register + offset address.  The register satisfies
> +       loongarch_valid_base_register_p and the offset is a const_arith_operand.
> +
> +   ADDRESS_REG_REG
> +       A base register indexed by (optionally scaled) register.
> +
> +   ADDRESS_CONST_INT
> +       A signed 16-bit constant address.
> +
> +   ADDRESS_SYMBOLIC:
> +       A constant symbolic address.  */
> +enum loongarch_address_type
> +{
> +  ADDRESS_REG,
> +  ADDRESS_REG_REG,
> +  ADDRESS_CONST_INT,
> +  ADDRESS_SYMBOLIC
> +};
> +
> +
> +/* Information about an address described by loongarch_address_type.
> +
> +   ADDRESS_CONST_INT
> +       No fields are used.
> +
> +   ADDRESS_REG
> +       REG is the base register and OFFSET is the constant offset.
> +
> +   ADDRESS_SYMBOLIC
> +       SYMBOL_TYPE is the type of symbol that the address references.  */
> +struct loongarch_address_info
> +{
> +  enum loongarch_address_type type;
> +  rtx reg;
> +  rtx offset;
> +  enum loongarch_symbol_type symbol_type;
> +};

It'd be worth documenting what the fields mean for ADDRESS_REG_REG too.
It looks like OFFSET is the index register in that case.
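
e.g.:

   ADDRESS_REG_REG
       REG is the base register and OFFSET is the index register.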

> […]
> +/* The value of TARGET_ATTRIBUTE_TABLE.  */
> +static const struct attribute_spec loongarch_attribute_table[] = {
> +    /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
> +       affects_type_identity, handler, exclude }  */
> +      { NULL, 0, 0, false, false, false, false, NULL, NULL }
> +};

I think it'd be better to leave this and TARGET_ATTRIBUTE_TABLE
undefined until a non-terminator entry is needed.

> […]
> +/* See whether TYPE is a record whose fields should be returned in one or
> +   two floating-point registers.  If so, populate FIELDS accordingly.  */
> +
> +static unsigned
> +loongarch_pass_aggregate_in_fpr_pair_p (const_tree type,
> +					loongarch_aggregate_field fields[2])

_p usually means a predicate that returns bool, and the name doesn't
really match the behaviour of the function (which as the comment says,
checks for single FPRs as well as pairs).

Maybe loongarch_pass_aggregate_num_fprs or something would be better.

> […]
> +/* See whether TYPE is a record whose fields should be returned in one or

typo: s/one or/one/

> +   floating-point register and one integer register.  If so, populate
> +   FIELDS accordingly.  */
> +
> +static bool
> +loongarch_pass_aggregate_in_fpr_and_gpr_p (const_tree type,
> +					   loongarch_aggregate_field fields[2])
> +{
> +  unsigned num_int = 0, num_float = 0;
> +  int n = loongarch_flatten_aggregate_argument (type, fields);
> +
> +  for (int i = 0; i < n; i++)
> +    {
> +      num_float += SCALAR_FLOAT_TYPE_P (fields[i].type);
> +      num_int += INTEGRAL_TYPE_P (fields[i].type);
> +    }
> +
> +  return num_int == 1 && num_float == 1;
> +}
> […]
> +static void
> +loongarch_emit_stack_tie (void)
> +{
> +  if (Pmode == SImode)
> +    emit_insn (gen_stack_tiesi (stack_pointer_rtx, hard_frame_pointer_rtx));
> +  else
> +    emit_insn (gen_stack_tiedi (stack_pointer_rtx, hard_frame_pointer_rtx));

These days it's possible to write this as:

  emit_insn (gen_stack_tie (Pmode, stack_pointer_rtx, hard_frame_pointer_rtx));

with the stack_tie insn being defined using an “@” at the start of the name:

  (define_insn "@stack_tie<mode>"

Same for the later tls patterns.

> +}
> […]
> +/* Expand the "prologue" pattern.  */
> +
> +void
> +loongarch_expand_prologue (void)
> +{
> +  struct loongarch_frame_info *frame = &cfun->machine->frame;
> +  HOST_WIDE_INT size = frame->total_size;
> +  HOST_WIDE_INT tmp;
> +  unsigned mask = frame->mask;
> +  rtx insn;
> +
> +  if (flag_stack_usage_info)
> +    current_function_static_stack_size = size;
> +
> +  if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK
> +      || flag_stack_clash_protection)
> +    {
> +      if (crtl->is_leaf && !cfun->calls_alloca)
> +	{
> +	  if (size > PROBE_INTERVAL && size > get_stack_check_protect ())
> +	    {
> +	      tmp = size - get_stack_check_protect ();
> +	      loongarch_emit_probe_stack_range (get_stack_check_protect (),
> +						tmp);
> +	    }
> +	}
> +      else if (size > 0)
> +	loongarch_emit_probe_stack_range (get_stack_check_protect (), size);
> +    }
> +
> +  /* Save the registers.  */
> +  if ((frame->mask | frame->fmask) != 0)
> +    {
> +      HOST_WIDE_INT step1 = MIN (size, loongarch_first_stack_step (frame));
> +
> +      insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx,
> +			    GEN_INT (-step1));
> +      RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
> +      size -= step1;
> +      loongarch_for_each_saved_reg (size, loongarch_save_reg);
> +    }
> +
> +  frame->mask = mask; /* Undo the above fib.  */

Is this and the “unsigned mask = frame->mask” still needed?
I couldn't see where frame->mask was changed.

> […]
> +/* Expand an "epilogue" or "sibcall_epilogue" pattern; SIBCALL_P
> +   says which.  */
> +
> +void
> +loongarch_expand_epilogue (bool sibcall_p)
> +{
> +  /* Split the frame into two.  STEP1 is the amount of stack we should
> +     deallocate before restoring the registers.  STEP2 is the amount we
> +     should deallocate afterwards.
> +
> +     Start off by assuming that no registers need to be restored.  */
> +  struct loongarch_frame_info *frame = &cfun->machine->frame;
> +  HOST_WIDE_INT step1 = frame->total_size;
> +  HOST_WIDE_INT step2 = 0;
> +  rtx ra = gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM);
> +  rtx insn;
> +
> +  /* We need to add memory barrier to prevent read from deallocated stack.  */
> +  bool need_barrier_p
> +    = (get_frame_size () + cfun->machine->frame.arg_pointer_offset) != 0;
> +
> +  if (!sibcall_p && loongarch_can_use_return_insn ())
> +    {
> +      emit_jump_insn (gen_return ());
> +      return;
> +    }
> +
> +  /* Move past any dynamic stack allocations.  */
> +  if (cfun->calls_alloca)
> +    {
> +      /* Emit a barrier to prevent loads from a deallocated stack.  */
> +      loongarch_emit_stack_tie ();
> +      need_barrier_p = false;
> +
> +      rtx adjust = GEN_INT (-frame->hard_frame_pointer_offset);
> +      if (!IMM12_OPERAND (INTVAL (adjust)))
> +	{
> +	  loongarch_emit_move (LARCH_PROLOGUE_TEMP (Pmode), adjust);
> +	  adjust = LARCH_PROLOGUE_TEMP (Pmode);
> +	}
> +
> +      insn = emit_insn (gen_add3_insn (stack_pointer_rtx,
> +				       hard_frame_pointer_rtx,
> +				       adjust));
> +
> +      rtx dwarf = NULL_RTX;
> +      rtx minus_offset = NULL_RTX;
> +      minus_offset = GEN_INT (-frame->hard_frame_pointer_offset);

The minus_offset = NULL_RTX looks redundant, might as well initialise
with the GEN_INT instead.
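
i.e.:

  rtx minus_offset = GEN_INT (-frame->hard_frame_pointer_offset);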

> […]
> +/* Return true if SYMBOL_REF X is associated with a global symbol
> +   (in the STB_GLOBAL sense).  */
> +
> +bool
> +loongarch_global_symbol_p (const_rtx x)
> +{
> +  if (GET_CODE (x) == LABEL_REF)
> +    return false;
> +
> +  const_tree decl = SYMBOL_REF_DECL (x);
> +
> +  if (!decl)
> +    return !SYMBOL_REF_LOCAL_P (x) || SYMBOL_REF_EXTERNAL_P (x);
> +
> +  /* Weakref symbols are not TREE_PUBLIC, but their targets are global
> +     or weak symbols.  Relocations in the object file will be against
> +     the target symbol, so it's that symbol's binding that matters here.  */
> +  return DECL_P (decl) && (TREE_PUBLIC (decl) || DECL_WEAK (decl));
> +}
> +
> +bool
> +loongarch_global_symbol_noweak_p (const_rtx x)
> +{
> +  if (GET_CODE (x) == LABEL_REF)
> +    return false;
> +
> +  const_tree decl = SYMBOL_REF_DECL (x);
> +
> +  if (!decl)
> +    return !SYMBOL_REF_LOCAL_P (x) || SYMBOL_REF_EXTERNAL_P (x);
> +
> +  /* Weakref symbols are not TREE_PUBLIC, but their targets are global
> +     or weak symbols.  Relocations in the object file will be against
> +     the target symbol, so it's that symbol's binding that matters here.  */

This comment doesn't apply to the noweak function.

> +  return DECL_P (decl) && TREE_PUBLIC (decl);
> +}
> […]
> +/* Return the method that should be used to access SYMBOL_REF or
> +   LABEL_REF X in context CONTEXT.  */
> +
> +static enum loongarch_symbol_type
> +loongarch_classify_symbol (const_rtx x,
> +			   enum loongarch_symbol_context context  \
> +			   ATTRIBUTE_UNUSED)
> +{
> +  if (GET_CODE (x) == LABEL_REF)
> +    {
> +      return SYMBOL_GOT_DISP;
> +    }

Minor formatting nit: redundant braces.  Some other instances
later in the file.
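
i.e.:

  if (GET_CODE (x) == LABEL_REF)
    return SYMBOL_GOT_DISP;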

> +
> +  gcc_assert (GET_CODE (x) == SYMBOL_REF);
> +
> +  if (SYMBOL_REF_TLS_MODEL (x))
> +    return SYMBOL_TLS;
> +
> +  if (GET_CODE (x) == SYMBOL_REF)
> +    return SYMBOL_GOT_DISP;
> +
> +  return SYMBOL_GOT_DISP;
> +}
> +
> +/* Return true if X is a symbolic constant that can be used in context
> +   CONTEXT.  If it is, store the type of the symbol in *SYMBOL_TYPE.  */
> +
> +bool
> +loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_context context,
> +			       enum loongarch_symbol_type *symbol_type)
> +{
> +  rtx offset;
> +
> +  split_const (x, &x, &offset);
> +  if (UNSPEC_ADDRESS_P (x))
> +    {
> +      *symbol_type = UNSPEC_ADDRESS_TYPE (x);
> +      x = UNSPEC_ADDRESS (x);
> +    }
> +  else if (GET_CODE (x) == SYMBOL_REF || GET_CODE (x) == LABEL_REF)
> +    {
> +      *symbol_type = loongarch_classify_symbol (x, context);
> +      if (*symbol_type == SYMBOL_TLS)
> +	return true;
> +    }
> +  else
> +    return false;
> +
> +  if (offset == const0_rtx)
> +    return true;
> +
> +  /* Check whether a nonzero offset is valid for the underlying
> +     relocations.  */
> +  switch (*symbol_type)
> +    {
> +      /* Fall through.  */
> +

There are no longer any previous cases to fall through from.

> +    case SYMBOL_GOT_DISP:
> +    case SYMBOL_TLSGD:
> +    case SYMBOL_TLSLDM:
> +    case SYMBOL_TLS:
> +      return false;
> +    }
> +  gcc_unreachable ();
> +}
> +
> +/* Like loongarch_symbol_insns We rely on the fact that, in the worst case.  */

This comment seems incomplete.

> +
> +static int
> +loongarch_symbol_insns_1 (enum loongarch_symbol_type type, machine_mode mode)
> +{
> +  switch (type)
> +    {
> +    case SYMBOL_GOT_DISP:
> +      /* The constant will have to be loaded from the GOT before it
> +	 is used in an address.  */
> +      if (mode != MAX_MACHINE_MODE)
> +	return 0;
> +
> +      /* Fall through.  */

No (case) fall through here.  I guess the comment should just be deleted.

> +
> +      return 3;
> +
> +    case SYMBOL_TLSGD:
> +    case SYMBOL_TLSLDM:
> +      return 1;
> +
> +    case SYMBOL_TLS:
> +      /* We don't treat a bare TLS symbol as a constant.  */
> +      return 0;
> +    }
> +  gcc_unreachable ();
> +}
> +
> +/* If MODE is MAX_MACHINE_MODE, return the number of instructions needed
> +   to load symbols of type TYPE into a register.  Return 0 if the given
> +   type of symbol cannot be used as an immediate operand.
> +
> +   Otherwise, return the number of instructions needed to load or store
> +   values of mode MODE to or from addresses of type TYPE.  Return 0 if
> +   the given type of symbol is not valid in addresses.
> +
> +   In both cases, instruction counts are based off BASE_INSN_LENGTH.  */
> +
> +static int
> +loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode)
> +{
> +  return loongarch_symbol_insns_1 (type, mode) * (1);

“* (1)” looks odd; maybe just remove it.
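
I.e.:

  return loongarch_symbol_insns_1 (type, mode);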

> +}
> […]
> +/* Return true if, for every base register BASE_REG, (plus BASE_REG X)
> +   can address a value of mode MODE.  */
> +
> +static bool
> +loongarch_valid_offset_p (rtx x, machine_mode mode)
> +{
> +  /* Check that X is a signed 12-bit number,
> +   * or check that X is a signed 16-bit number
> +   * and offset 4 byte aligned.  */

Minor nit, but it isn't GNU style to indent comment lines with a leading “*”.
(This was the only instance I could see in the patch.)
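
I.e.:

  /* Check that X is a signed 12-bit number,
     or check that X is a signed 16-bit number
     and offset 4 byte aligned.  */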

> +  if (!(const_arith_operand (x, Pmode)
> +	|| ((mode == E_SImode || mode == E_DImode)
> +	    && const_imm16_operand (x, Pmode)
> +	    && (loongarch_signed_immediate_p (INTVAL (x), 14, 2)))))
> +    return false;
> +
> +  /* We may need to split multiword moves, so make sure that every word
> +     is accessible.  */
> +  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD
> +      && !IMM12_OPERAND (INTVAL (x) + GET_MODE_SIZE (mode) - UNITS_PER_WORD))
> +    return false;
> +
> +  return true;
> +}
> +
> […]
> +/* Return the number of instructions needed to load constant X,
> +   assuming that BASE_INSN_LENGTH is the length of one instruction.
> +   Return 0 if X isn't a valid constant.  */
> +
> +int
> +loongarch_const_insns (rtx x)
> +{
> +  enum loongarch_symbol_type symbol_type;
> +  rtx offset;
> +
> +  switch (GET_CODE (x))
> +    {
> +    case CONST_INT:
> +      return loongarch_integer_cost (INTVAL (x));
> +
> +    case CONST_VECTOR:
> +      /* Fall through.  */
> +    case CONST_DOUBLE:
> +      /* Allow zeros for normal mode, where we can use $0.  */

“normal mode” is a holdover from MIPS (where it meant “not MIPS16
and not micromips”).

> +      return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
> +
> +    case CONST:
> +      /* See if we can refer to X directly.  */
> +      if (loongarch_symbolic_constant_p (x, SYMBOL_CONTEXT_LEA, &symbol_type))
> +	return loongarch_symbol_insns (symbol_type, MAX_MACHINE_MODE);
> +
> +      /* Otherwise try splitting the constant into a base and offset.
> +	 If the offset is a 12-bit value, we can load the base address
> +	 into a register and then use ADDI.{W/D} to add in the offset.
> +	 If the offset is larger, we can load the base and offset
> +	 into separate registers and add them together with ADD.{W/D}.
> +	 However, the latter is only possible before reload; during
> +	 and after reload, we must have the option of forcing the
> +	 constant into the pool instead.  */
> +      split_const (x, &x, &offset);
> +      if (offset != 0)
> +	{
> +	  int n = loongarch_const_insns (x);
> +	  if (n != 0)
> +	    {
> +	      if (IMM12_INT (offset))
> +		return n + 1;
> +	      else if (!targetm.cannot_force_const_mem (GET_MODE (x), x))
> +		return n + 1 + loongarch_integer_cost (INTVAL (offset));
> +	    }
> +	}
> +      return 0;
> +
> +    case SYMBOL_REF:
> +    case LABEL_REF:
> +      return loongarch_symbol_insns (
> +	loongarch_classify_symbol (x, SYMBOL_CONTEXT_LEA), MAX_MACHINE_MODE);
> +
> +    default:
> +      return 0;
> +    }
> +}
> +
> +/* X is a doubleword constant that can be handled by splitting it into
> +   two words and loading each word separately.  Return the number of
> +   instructions required to do this, assuming that BASE_INSN_LENGTH
> +   is the length of one instruction.  */

The BASE_INSN_LENGTH part no longer applies.  Same for later functions.
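
E.g.:

  /* X is a doubleword constant that can be handled by splitting it into
     two words and loading each word separately.  Return the number of
     instructions required to do this.  */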

> […]
> +/* Return true if there is a instruction that implements CODE
> +   and if that instruction accepts X as an immediate operand.  */
> +
> +static int
> +loongarch_immediate_operand_p (int code, HOST_WIDE_INT x)
> +{
> +  switch (code)
> +    {
> +    case ASHIFT:
> +    case ASHIFTRT:
> +    case LSHIFTRT:
> +      /* All shift counts are truncated to a valid constant.  */
> +      return true;
> +
> +    case ROTATE:
> +    case ROTATERT:
> +      /* Likewise rotates, if the target supports rotates at all.  */

The comment doesn't really apply to Loongson, where rotates are
unconditionally available.

> […]
> +/* Return the cost of sign-extending OP to mode MODE, not including the
> +   cost of OP itself.  */
> +
> +static int
> +loongarch_sign_extend_cost (machine_mode mode, rtx op)
> +{
> +  if (MEM_P (op))
> +    /* Extended loads are as cheap as unextended ones.  */
> +    return 0;
> +
> +  if (TARGET_64BIT && mode == DImode && GET_MODE (op) == SImode)
> +    /* A sign extension from SImode to DImode in 64-bit mode is free.  */
> +    return 0;

Coming back to the TRULY_NOOP_TRUNCATION question from patch 4:
this would no longer be true if the definition of TRULY_NOOP_TRUNCATION
were removed.  On balance, though, I'd expect removing the definition
to improve the quality of the generated code.

> +
> +  return COSTS_N_INSNS (1);
> +}
> […]
> +/* Return one word of double-word value OP, taking into account the fixed
> +   endianness of certain registers.  HIGH_P is true to select the high part,
> +   false to select the low part.  */
> +
> +rtx
> +loongarch_subword (rtx op, bool high_p)
> +{
> +  unsigned int byte, offset;
> +  machine_mode mode;
> +
> +  mode = GET_MODE (op);
> +  if (mode == VOIDmode)
> +    mode = TARGET_64BIT ? TImode : DImode;
> +
> +  if (high_p)
> +    byte = UNITS_PER_WORD;
> +  else
> +    byte = 0;
> +
> +  if (FP_REG_RTX_P (op))
> +    {
> +      /* Paired FPRs are always ordered little-endian.  */
> +      offset = (UNITS_PER_WORD < UNITS_PER_HWFPVALUE ? high_p : byte != 0);
> +      return gen_rtx_REG (word_mode, REGNO (op) + offset);
> +    }

Do you need this?  Loongson seems to be little-endian only, so this…

> +
> +  if (MEM_P (op))
> +    return loongarch_rewrite_small_data (adjust_address (op, word_mode, byte));
> +
> +  return simplify_gen_subreg (word_mode, op, mode, byte);

…default behaviour should work.

> +}
> +
> […]
> +/* Forward declaration.  Used below.  */
> +static HOST_WIDE_INT
> +loongarch_constant_alignment (const_tree exp, HOST_WIDE_INT align);

It'd be better to move the definition further up, since there isn't a cyclic
dependency.

> […]
> +}
> +/* Return the appropriate instructions to move SRC into DEST.  Assume
> +   that SRC is operand 1 and DEST is operand 0.  */

Nit: missing newline before the comment.

> +
> +const char *
> +loongarch_output_move (rtx dest, rtx src)
> +{
> […]
> +      if (symbolic_operand (src, VOIDmode))
> +	{
> +	  if ((TARGET_CMODEL_TINY && (!loongarch_global_symbol_p (src)
> +				      || loongarch_symbol_binds_local_p (src)))
> +	      || (TARGET_CMODEL_TINY_STATIC && !loongarch_weak_symbol_p (src)))
> +	    {
> +	      /* The symbol must be aligned to 4 byte.  */
> +	      unsigned int align;
> +
> +	      if (GET_CODE (src) == LABEL_REF)
> +		align = 128 /* Whatever.  */;

128 seems a strange choice.  Couldn't it just be 32 (the number of bits
in an instruction) given the later use?

> +	      else if (CONSTANT_POOL_ADDRESS_P (src))
> +		align = GET_MODE_ALIGNMENT (get_pool_mode (src));
> +	      else if (TREE_CONSTANT_POOL_ADDRESS_P (src))
> +		{
> +		  tree exp = SYMBOL_REF_DECL (src);
> +		  align = TYPE_ALIGN (TREE_TYPE (exp));
> +		  align = loongarch_constant_alignment (exp, align);
> +		}
> +	      else if (SYMBOL_REF_DECL (src))
> +		align = DECL_ALIGN (SYMBOL_REF_DECL (src));
> +	      else if (SYMBOL_REF_HAS_BLOCK_INFO_P (src)
> +		       && SYMBOL_REF_BLOCK (src) != NULL)
> +		align = SYMBOL_REF_BLOCK (src)->alignment;
> +	      else
> +		align = BITS_PER_UNIT;
> +
> +	      if (align % (4 * 8) == 0)
> +		return "pcaddi\t%0,%%pcrel(%1)>>2";
> +	    }
> +	  if (TARGET_CMODEL_TINY
> +	      || TARGET_CMODEL_TINY_STATIC
> +	      || TARGET_CMODEL_NORMAL
> +	      || TARGET_CMODEL_LARGE)
> +	    {
> +	      if (!loongarch_global_symbol_p (src)
> +		  || loongarch_symbol_binds_local_p (src))
> +		return "la.local\t%0,%1";
> +	      else
> +		return "la.global\t%0,%1";
> +	    }
> +	  if (TARGET_CMODEL_EXTREME)
> +	    {
> +	      sorry ("Normal symbol loading not implemented in extreme mode.");
> +	      /* GCC complains.  */
> +	      /* return ""; */

Similar comments here to the review of part 4: sorry () isn't a fatal error,
so the function still needs to return something.  Is this a situation
that can be triggered by specific source code, or is it an internal
error that the user should never see?
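
If the code has to stay for now, one possible shape (assuming an empty
template is safe for the caller to discard) would be:

  if (TARGET_CMODEL_EXTREME)
    {
      sorry ("normal symbol loading not implemented in extreme mode");
      return "";
    }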

> […]
> +/* Like riscv_emit_int_compare, but for floating-point comparisons.  */

s/riscv/loongarch/

> +
> +static void
> +loongarch_emit_float_compare (enum rtx_code *code, rtx *op0, rtx *op1)
> +{
> […]
> +/* Implement TARGET_ASM_FUNCTION_RODATA_SECTION.
> +
> +   The complication here is that, with the combination
> +   !TARGET_ABSOLUTE_ABICALLS , jump tables will use
> +   absolute addresses, and should therefore not be included in the
> +   read-only part of a DSO.  Handle such cases by selecting a normal
> +   data section instead of a read-only one.  The logic apes that in
> +   default_function_rodata_section.  */

The !TARGET_ABSOLUTE_ABICALLS bit doesn't apply to Loongson.
I think you can just say “The complication here is that jump tables…”.

> +
> +static section *
> +loongarch_function_rodata_section (tree decl, bool)
> +{
> +  return default_function_rodata_section (decl, false);
> +}
> […]
> +/* Implement TARGET_USE_ANCHORS_FOR_SYMBOL_P.  We don't want to use
> +   anchors for small data: the GP register acts as an anchor in that
> +   case.  We also don't want to use them for PC-relative accesses,
> +   where the PC acts as an anchor.  */
> +
> +static bool
> +loongarch_use_anchors_for_symbol_p (const_rtx symbol)
> +{
> +  return default_use_anchors_for_symbol_p (symbol);
> +}

The comment doesn't describe the behaviour.  Probably easiest to leave
this hook undefined.

> […]
> +/* Implement TARGET_DWARF_FRAME_REG_MODE.  */
> +
> +static machine_mode
> +loongarch_dwarf_frame_reg_mode (int regno)
> +{
> +  machine_mode mode = default_dwarf_frame_reg_mode (regno);
> +
> +  return mode;
> +}

This is the default behaviour, so it can be deleted.

> […]
> +#ifdef ASM_OUTPUT_SIZE_DIRECTIVE
> +extern int size_directive_output;
> +
> +/* Implement ASM_DECLARE_OBJECT_NAME.  This is like most of the standard ELF
> +   definitions except that it uses loongarch_declare_object to emit the label.
> +*/
> +
> +void
> +loongarch_declare_object_name (FILE *stream, const char *name,
> +			       tree decl ATTRIBUTE_UNUSED)
> +{
> +#ifdef ASM_OUTPUT_TYPE_DIRECTIVE
> +#ifdef USE_GNU_UNIQUE_OBJECT

Are these preprocessor conditions necessary for Loongson?  I would have
expected things to be more regular than they were for MIPS.

(Also applies to the ASM_OUTPUT_SIZE_DIRECTIVE condition above,
and to similar conditions later.)

> +  /* As in elfos.h.  */
> +  if (USE_GNU_UNIQUE_OBJECT && DECL_ONE_ONLY (decl)
> +      && (!DECL_ARTIFICIAL (decl) || !TREE_READONLY (decl)))
> +    ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "gnu_unique_object");
> +  else
> +#endif
> +    ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "object");
> +#endif
> +
> +  size_directive_output = 0;
> +  if (!flag_inhibit_size_directive && DECL_SIZE (decl))
> +    {
> +      HOST_WIDE_INT size;
> +
> +      size_directive_output = 1;
> +      size = int_size_in_bytes (TREE_TYPE (decl));
> +      ASM_OUTPUT_SIZE_DIRECTIVE (stream, name, size);
> +    }
> +
> +  loongarch_declare_object (stream, name, "", ":\n");
> +}
> […]
> +/* Implement TARGET_ASM_FILE_START.  */
> +
> +static void
> +loongarch_file_start (void)
> +{
> +  default_file_start ();
> +}

This is the default behaviour, so it can be deleted.

> […]
> +bool
> +loongarch_hard_regno_rename_ok (unsigned int old_reg ATTRIBUTE_UNUSED,
> +				unsigned int new_reg ATTRIBUTE_UNUSED)
> +{
> +  return true;
> +}

Same for this.

> […]
> +/* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
> +
> +static bool
> +loongarch_can_change_mode_class (machine_mode from, machine_mode to,
> +				 reg_class_t rclass)
> +{
> +  /* Allow conversions between different Loongson integer vectors,
> +     and between those vectors and DImode.  */
> +  if (GET_MODE_SIZE (from) == 8 && GET_MODE_SIZE (to) == 8
> +      && INTEGRAL_MODE_P (from) && INTEGRAL_MODE_P (to))
> +    return true;

Is the comment accurate?  There didn't seem to be any integer vector support.

> +
> +  return !reg_classes_intersect_p (FP_REGS, rclass);
> +}
> […]
> +/* Implement TARGET_SECONDARY_MEMORY_NEEDED.
> +
> +   This can be achieved using MOVFRH2GR.S/MOVGR2FRH.W when these instructions
> +   are available but otherwise moves must go via memory.  Using
> +   MOVGR2FR/MOVFR2GR to access the lower-half of these registers would require
> +   a forbidden single-precision access.  We require all double-word moves to
> +   use memory because adding even and odd floating-point registers classes
> +   would have a significant impact on the backend.  */
> +
> +static bool
> +loongarch_secondary_memory_needed (machine_mode mode ATTRIBUTE_UNUSED,
> +				   reg_class_t class1, reg_class_t class2)
> +{
> +  /* Ignore spilled pseudos.  */
> +  if (lra_in_progress && (class1 == NO_REGS || class2 == NO_REGS))
> +    return false;
> +
> +  return false;
> +}

Looks like this hook definition can be deleted, which is a good sign!

> […]
> +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
> +
> +static machine_mode
> +loongarch_preferred_simd_mode (scalar_mode mode ATTRIBUTE_UNUSED)
> +{
> +  return word_mode;
> +}

This is the default behaviour, so it can be deleted.

> […]
> +/* Return the number of instructions that can be issued per cycle.  */
> +
> +static int
> +loongarch_issue_rate (void)
> +{
> +  if ((unsigned long) __ACTUAL_TUNE < N_TUNE_TYPES)
> +    return loongarch_cpu_issue_rate[__ACTUAL_TUNE];
> +  else
> +    return 1;
> +}

Following on from a comment about patch 2: __FOO macros are in the
implementation namespace, so __ACTUAL_TUNE should be something like
LARCH_ACTUAL_TUNE instead.
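
I.e. something like:

  if ((unsigned long) LARCH_ACTUAL_TUNE < N_TUNE_TYPES)
    return loongarch_cpu_issue_rate[LARCH_ACTUAL_TUNE];

(with the macro itself renamed to match, wherever it is defined).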

> […]
> +/* A structure representing the state of the processor pipeline.
> +   Used by the loongarch_sim_* family of functions.  */
> +
> +struct loongarch_sim
> +{
> +  /* The maximum number of instructions that can be issued in a cycle.
> +     (Caches loongarch_issue_rate.)  */
> +  unsigned int issue_rate;
> +
> +  /* The current simulation time.  */
> +  unsigned int time;
> +
> +  /* How many more instructions can be issued in the current cycle.  */
> +  unsigned int insns_left;
> +
> +  /* LAST_SET[X].INSN is the last instruction to set register X.
> +     LAST_SET[X].TIME is the time at which that instruction was issued.
> +     INSN is null if no instruction has yet set register X.  */
> +  struct
> +  {
> +    rtx_insn *insn;
> +    unsigned int time;
> +  } last_set[FIRST_PSEUDO_REGISTER];
> +
> +  /* The pipeline's current DFA state.  */
> +  state_t dfa_state;
> +};
> +
> +/* Reset STATE to the initial simulation state.  */
> +
> +static void
> +loongarch_sim_reset (struct loongarch_sim *state)
> +{
> +  curr_state = state->dfa_state;
> +
> +  state->time = 0;
> +  state->insns_left = state->issue_rate;
> +  memset (&state->last_set, 0, sizeof (state->last_set));
> +  state_reset (curr_state);
> +
> +  targetm.sched.init (0, false, 0);
> +  advance_state (curr_state);
> +}
> +
> +/* Initialize STATE before its first use.  DFA_STATE points to an
> +   allocated but uninitialized DFA state.  */
> +
> +static void
> +loongarch_sim_init (struct loongarch_sim *state, state_t dfa_state)
> +{
> +  if (targetm.sched.init_dfa_pre_cycle_insn)
> +    targetm.sched.init_dfa_pre_cycle_insn ();
> +
> +  if (targetm.sched.init_dfa_post_cycle_insn)
> +    targetm.sched.init_dfa_post_cycle_insn ();
> +
> +  state->issue_rate = loongarch_issue_rate ();
> +  state->dfa_state = dfa_state;
> +  loongarch_sim_reset (state);
> +}
> +
> +/* Set up costs based on the current architecture and tuning settings.  */
> +
> +static void
> +loongarch_set_tuning_info (void)
> +{
> +  dfa_start ();
> +
> +  struct loongarch_sim state;
> +  loongarch_sim_init (&state, alloca (state_size ()));
> +
> +  dfa_finish ();
> +}
> +/* Implement TARGET_EXPAND_TO_RTL_HOOK.  */
> +
> +static void
> +loongarch_expand_to_rtl_hook (void)
> +{
> +  /* We need to call this at a point where we can safely create sequences
> +     of instructions, so TARGET_OVERRIDE_OPTIONS is too early.  We also
> +     need to call it at a point where the DFA infrastructure is not
> +     already in use, so we can't just call it lazily on demand.
> +
> +     At present, loongarch_tuning_info is only needed during post-expand
> +     RTL passes such as split_insns, so this hook should be early enough.
> +     We may need to move the call elsewhere if loongarch_tuning_info starts
> +     to be used for other things (such as rtx_costs, or expanders that
> +     could be called during gimple optimization).  */
> +
> +  loongarch_set_tuning_info ();
> +}

Do you need the code above?  It doesn't seem to be using the
simulator state.

IIRC, on MIPS this was originally used to pad instructions in a
VLIW-like way for NEC's VR4131.

> […]
> +static void
> +loongarch_option_override_internal (struct gcc_options *opts,
> +				    struct gcc_options *opts_set);

Here too it would be better to rearrange the functions to avoid
the forward declaration.

> +
> +/* Implement TARGET_OPTION_OVERRIDE.  */
> +
> +static void
> +loongarch_option_override (void)
> +{
> +  loongarch_option_override_internal (&global_options, &global_options_set);
> +}
> +
> +static void
> +loongarch_option_override_internal (struct gcc_options *opts,
> +				    struct gcc_options *opts_set)
> +{
> +  int i, regno, mode;
> +
> +  (void) opts_set;

Might as well just drop the argument name from the prototype instead.

> +
> +  if (flag_pic)
> +    g_switch_value = 0;
> +
> +  /* Handle target-specific options: compute defaults/conflicts etc.  */
> +  loongarch_config_target (&la_target, la_opt_switches,
> +			   la_opt_cpu_arch, la_opt_cpu_tune, la_opt_fpu,
> +			   la_opt_abi_base, la_opt_abi_ext, la_opt_cmodel, 0);
> +
> +  /* End of code shared with GAS.  */

Just checking: is this code shared with GAS, in the way that the
MIPS code was?  (Genuine question, since it might be.)

> […]
> +/* Implement TARGET_SHIFT_TRUNCATION_MASK.  We want to keep the default
> +   behavior of TARGET_SHIFT_TRUNCATION_MASK for non-vector modes.  */
> +
> +static unsigned HOST_WIDE_INT
> +loongarch_shift_truncation_mask (machine_mode mode)
> +{
> +  return GET_MODE_BITSIZE (mode) - 1;
> +}

It shouldn't be necessary to define this hook, given that
SHIFT_COUNT_TRUNCATED is 1.  (The reason MIPS had to define
it was that vector shifts didn't truncate in the same way.)

> +
> +/* Implement TARGET_SCHED_REASSOCIATION_WIDTH.  */
> +
> +static int
> +loongarch_sched_reassociation_width (unsigned int opc ATTRIBUTE_UNUSED,
> +				     machine_mode mode ATTRIBUTE_UNUSED)
> +{
> +  return 1;
> +}

This also shouldn't be necessary, since it matches the default.

> […]
> +/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
> +
> +unsigned int
> +loongarch_case_values_threshold (void)
> +{
> +  return default_case_values_threshold ();
> +}

This is also the default.

> […]
> +/* All our function attributes are related to how out-of-line copies should
> +   be compiled or called.  They don't in themselves prevent inlining.  */
> +#undef TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P
> +#define TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P hook_bool_const_tree_true

This shouldn't be necessary, since there are no target-specific
attributes.

> […]
> diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
> new file mode 100644
> index 00000000000..791f9d064c5
> --- /dev/null
> +++ b/gcc/config/loongarch/loongarch.h
> @@ -0,0 +1,1273 @@
> +/* Definitions of target machine for GNU compiler.  LoongArch version.
> +   Copyright (C) 2021-2022 Free Software Foundation, Inc.
> +   Contributed by Loongson Ltd.
> +   Based on MIPS and RISC-V target for GNU compiler.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +/* LoongArch external variables defined in loongarch.c.  */

loongarch.cc now :-)

> […]
> +/* Support for a compile-time default CPU, et cetera.  The rules are:
> +   --with-divide is ignored if -mdivide-traps or -mdivide-breaks are
> +     specified.  */
> +#define OPTION_DEFAULT_SPECS \
> +  {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}"},

Loongson doesn't have the -mdivide-* options, so I think this can
be deleted.

> […]
> +/* This definition replaces the formerly used 'm' constraint with a
> +   different constraint letter in order to avoid changing semantics of
> +   the 'm' constraint when accepting new address formats in
> +   TARGET_LEGITIMATE_ADDRESS_P.  The constraint letter defined here
> +   must not be used in insn definitions or inline assemblies.  */
> +#define TARGET_MEM_CONSTRAINT 'w'

Do you need to do this for a new port like Loongson?  It looks like
TARGET_LEGITIMATE_ADDRESS_P accepts all valid forms of address,
including reg+reg (good!), so shouldn't "m" accept them too?

If there is already existing code that assumes that "m" is never indexed
then the definition obviously makes sense.  But if you don't know of any
such code then it would be better to make "m" accept all the things that
TARGET_LEGITIMATE_ADDRESS_P does (by removing this definition).

> +
> +/* Tell collect what flags to pass to nm.  */
> +#ifndef NM_FLAGS
> +#define NM_FLAGS "-Bn"
> +#endif
> +
> +/* SUBTARGET_ASM_DEBUGGING_SPEC handles passing debugging options to
> +   the assembler.  It may be overridden by subtargets.  */
> +
> +#ifndef SUBTARGET_ASM_DEBUGGING_SPEC
> +#define SUBTARGET_ASM_DEBUGGING_SPEC "\
> +%{g} %{g0} %{g1} %{g2} %{g3} \
> +%{ggdb:-g} %{ggdb0:-g0} %{ggdb1:-g1} %{ggdb2:-g2} %{ggdb3:-g3} \
> +%{gstabs:-g} %{gstabs0:-g0} %{gstabs1:-g1} %{gstabs2:-g2} %{gstabs3:-g3} \
> +%{gstabs+:-g} %{gstabs+0:-g0} %{gstabs+1:-g1} %{gstabs+2:-g2} %{gstabs+3:-g3}"

It doesn't look like this and %(subtarget_asm_debugging_spec)
(from EXTRA_SPECS) are used.  That's a good thing, since the stabs
stuff shouldn't be needed.

Probably best to remove this definition and the EXTRA_SPECS entry.

> +#endif
> […]
> +/* The number of consecutive floating-point registers needed to store the
> +   largest format supported by the FPU.  */
> +#define MAX_FPRS_PER_FMT 1
> +
> +/* The number of consecutive floating-point registers needed to store the
> +   smallest format supported by the FPU.  */
> +#define MIN_FPRS_PER_FMT 1

It might be clearer to drop this abstraction for Loongson, given that
it doesn't have the MIPS restrictions around 32-bit FPR pairs.

> […]
> +#define CALLEE_SAVED_REG_NUMBER(REGNO) \
> +  ((REGNO) >= 22 && (REGNO) <= 31 ? (REGNO) -22 : -1)

Formatting nit: missing space between “-” and “22”.

> […]
> +enum reg_class
> +{
> +  NO_REGS,	  /* no registers in set  */
> +  SIBCALL_REGS,	  /* SIBCALL_REGS  */
> +  JIRL_REGS,	  /* JIRL_REGS  */

These two comments don't really say anything.  It would be worth
describing them briefly in words.  E.g.: I assume SIBCALL_REGS
is something like:

  /* registers that can hold an indirect sibcall address.  */

> +  GR_REGS,	  /* integer registers  */
> +  CSR_REGS,	  /* integer registers except for $r0 and $r1 for lcsr.  */

Normally it's better to list supersets after subsets, so I think these
two should be the other way around.  Swapping them would mean updating
the register class macros too, of course.
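
I.e.:

  CSR_REGS,	  /* integer registers except for $r0 and $r1 for lcsr.  */
  GR_REGS,	  /* integer registers  */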

> +  FP_REGS,	  /* floating point registers  */
> +  FCC_REGS,	  /* status registers (fp status)  */
> +  FRAME_REGS,	  /* arg pointer and frame pointer  */
> +  ALL_REGS,	  /* all registers  */
> +  LIM_REG_CLASSES /* max value + 1  */
> +};
> […]
> +#define CONST_LOW_PART(VALUE) ((VALUE) -CONST_HIGH_PART (VALUE))

Formatting nit: missing space after “-”.

> […]
> +#define ELIMINABLE_REGS \
> +  { \
> +    {ARG_POINTER_REGNUM, STACK_POINTER_REGNUM}, \
> +      {ARG_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM}, \
> +      {FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM}, \
> +      {FRAME_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM}, \

Very minor, sorry, but: these last three lines look over-indented.

> +  }
> […]
> +/* This structure has to cope with two different argument allocation
> +   schemes.  Most LoongArch ABIs view the arguments as a structure, of which
> +   the first N words go in registers and the rest go on the stack.  If I
> +   < N, the Ith word might go in Ith integer argument register or in a
> +   floating-point register.  For these ABIs, we only need to remember
> +   the offset of the current argument into the structure.
> +
> +   So for the standard ABIs, the first N words are allocated to integer
> +   registers, and loongarch_function_arg decides on an argument-by-argument
> +   basis whether that argument should really go in an integer register,
> +   or in a floating-point one.  */

Does the comment apply to Loongson?  It seemed like there was only
really one allocation scheme (excepting soft-float vs. hard-float,
but I don't think the comment was referring to that).

> +
> +typedef struct loongarch_args
> +{
> +  /* Number of integer registers used so far, up to MAX_ARGS_IN_REGISTERS.  */
> +  unsigned int num_gprs;
> +
> +  /* Number of floating-point registers used so far, likewise.  */
> +  unsigned int num_fprs;
> +
> +} CUMULATIVE_ARGS;
> +
> […]
> +/* Output assembler code to FILE to increment profiler label # LABELNO
> +   for profiling a function entry.  */
> +
> +#define MCOUNT_NAME "_mcount"

This comment looks like a remnant from when profiling code was
injected as asm text.  I think it can be deleted now.

> […]
> +#define CASE_VECTOR_MODE Pmode
> +
> +/* Only use short offsets if their range will not overflow.  */
> +#define CASE_VECTOR_SHORTEN_MODE(MIN, MAX, BODY) Pmode

The comment doesn't seem to match the code.

> […]
> +struct GTY (()) machine_function
> +{
> +  /* The next floating-point condition-code register to allocate
> +     for 8CC targets, relative to FCC_REG_FIRST.  */
> +  unsigned int next_fcc;
> +
> +  /* The number of extra stack bytes taken up by register varargs.
> +     This area is allocated by the callee at the very top of the frame.  */
> +  int varargs_size;
> +
> +  /* The current frame information, calculated by loongarch_compute_frame_info.
> +   */
> +  struct loongarch_frame_info frame;
> +
> +  /* True if at least one of the formal parameters to a function must be
> +     written to the frame header (probably so its address can be taken).  */
> +  bool does_not_use_frame_header;
> +
> +  /* True if none of the functions that are called by this function need
> +     stack space allocated for their arguments.  */
> +  bool optimize_call_stack;
> +
> +  /* True if one of the functions calling this function may not allocate
> +     a frame header.  */
> +  bool callers_may_not_allocate_frame;

The last three fields seem to be unused.

> +};
> +#endif
> +
> +/* As on most targets, we want the .eh_frame section to be read-only where
> +   possible.  And as on most targets, this means two things:
> +
> +     (a) Non-locally-binding pointers must have an indirect encoding,
> +	 so that the addresses in the .eh_frame section itself become
> +	 locally-binding.
> +
> +     (b) A shared library's .eh_frame section must encode locally-binding
> +	 pointers in a relative (relocation-free) form.
> +
> +   However, LoongArch has traditionally not allowed directives like:
> +
> +	.long	x-.
> +
> +   in cases where "x" is in a different section, or is not defined in the
> +   same assembly file.  We are therefore unable to emit the PC-relative
> +   form required by (b) at assembly time.

Is this really true for LoongArch?  It should be possible to use a more
efficient definition on modern targets.
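
For comparison (a sketch only, assuming the assembler and linker can
handle cross-section PC-relative relocations, as they can on other
modern targets such as RISC-V):

  #define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
    (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | DW_EH_PE_sdata4)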

> +
> +   Fortunately, the linker is able to convert absolute addresses into
> +   PC-relative addresses on our behalf.  Unfortunately, only certain
> +   versions of the linker know how to do this for indirect pointers,
> +   and for personality data.  We must fall back on using writable
> +   .eh_frame sections for shared libraries if the linker does not
> +   support this feature.  */
> +#define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
> +  (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_absptr)
> +
> +/* Several named LoongArch patterns depend on Pmode.  These patterns have the
> +   form <NAME>_si for Pmode == SImode and <NAME>_di for Pmode == DImode.
> +   Add the appropriate suffix to generator function NAME and invoke it
> +   with arguments ARGS.  */
> +#define PMODE_INSN(NAME, ARGS) \
> +  (Pmode == SImode ? NAME##_si ARGS : NAME##_di ARGS)

FWIW, the “@” .md notation mentioned above would be a more modern
way of doing this.
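
With a made-up pattern name, purely for illustration, that would look
something like:

  ;; .md side: the "@" prefix makes genemit create an overloaded
  ;; generator that takes the mode as its first argument.
  (define_insn "@loongarch_foo<mode>" ...)

  /* C side, replacing the PMODE_INSN macro: */
  emit_insn (gen_loongarch_foo (Pmode, dest, src));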

Thanks,
Richard

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 04/12] LoongArch Port: Machine description files.
  2022-03-04  7:18 ` [PATCH v8 04/12] LoongArch Port: Machine description files xuchenghua
  2022-03-06 16:16   ` Richard Sandiford
@ 2022-03-07 20:15   ` Xi Ruoyao
  2022-03-10  6:26     ` 程璐璐
  1 sibling, 1 reply; 34+ messages in thread
From: Xi Ruoyao @ 2022-03-07 20:15 UTC (permalink / raw)
  To: xuchenghua, gcc-patches; +Cc: chenglulu, joseph

On Fri, 2022-03-04 at 15:18 +0800, xuchenghua@loongson.cn wrote:

>         * config/loongarch/loongarch.md: New file.

An ICE happens when building OpenSSH-8.9p1.  Investigation shows it's
caused by the flag "-fzero-call-used-regs=": the compiler attempts to
clear the FCCx registers but can't figure out how.

This flag also triggers ICE for other targets (for example, PR 104820
for MIPS), and the related tests (zero-scratch-regs-{8,9,10,11}.c) are
marked dg-skip for many targets.

But it's unfortunate that packages like OpenSSH have already started
to use this flag... I guess they just enabled it once they saw it was
working for i386 :(.  So it's better to solve the problem for a new
target.

A "quick fix" is adding an insn to clear FCCx.  This is enough to build
OpenSSH and make zero-scratch-regs-{8,9,10,11}.c PASS.

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index a9a8bc4b038..76c5ded9fe4 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2020,6 +2020,12 @@
   DONE;
 })
 
+;; Clear one FCC register
+(define_insn "movfcc" [(set (match_operand:FCC 0 "register_operand" "=z")
+                     (const_int 0))]
+  ""
+  "movgr2cf\t%0,$r0")
+
 ;; Conditional move instructions.
 
 (define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 06/12] LoongArch Port: Builtin functions.
  2022-03-04  7:18 ` [PATCH v8 06/12] LoongArch Port: Builtin functions xuchenghua
@ 2022-03-08 18:43   ` Richard Sandiford
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Sandiford @ 2022-03-08 18:43 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph

xuchenghua@loongson.cn writes:
> +#ifndef _GCC_LOONGARCH_BASE_INTRIN_H
> +#define _GCC_LOONGARCH_BASE_INTRIN_H
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +typedef struct drdtime
> +{
> +  unsigned long dvalue;
> +  unsigned long dtimeid;
> +} __drdtime_t;
> +
> +typedef struct rdtime
> +{
> +  unsigned int value;
> +  unsigned int timeid;
> +} __rdtime_t;
> +
> +#ifdef __loongarch64
> +extern __inline __drdtime_t
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +__builtin_loongarch_rdtime_d (void)
> +{
> +  __drdtime_t drdtime;
> +  __asm__ volatile (
> +    "rdtime.d\t%[val],%[tid]\n\t"
> +    : [val]"=&r"(drdtime.dvalue),[tid]"=&r"(drdtime.dtimeid)
> +    :);
> +  return drdtime;

It's usually better to use __foo names for local variables and
parameters, in case the user defines a macro called (in this case)
drdtime.

> +}
> +#define __rdtime_d __builtin_loongarch_rdtime_d

Are both of these names “public”?  In other words, can users use
__builtin_loongarch_rdtime_d directly, instead of using __rdtime_d?

If only __rdtime_d is public then it might be better to define
the function directly, since that will give better error messages.

> […]
> +#if defined __loongarch64
> +/* Assembly instruction format:	ui5, rj, si12.  */
> +/* Data types in instruction templates:  VOID, USI, UDI, SI.  */
> +#define __dcacop(/*ui5*/ _1, /*unsigned long int*/ _2, /*si12*/ _3) \
> +  ((void) __builtin_loongarch_dcacop ((_1), (unsigned long int) (_2), (_3)))
> +#else
> +#error "Don't support this ABI."

“Unsupported ABI” might be better.  Same for the rest of the file.

> +#endif
> […]
> +/* Invoke MACRO (COND) for each fcmp.cond.{s/d} condition.  */
> +#define LARCH_FP_CONDITIONS(MACRO) \
> +  MACRO (f),	\
> +  MACRO (un),	\
> +  MACRO (eq),	\
> +  MACRO (ueq),	\
> +  MACRO (olt),	\
> +  MACRO (ult),	\
> +  MACRO (ole),	\
> +  MACRO (ule),	\
> +  MACRO (sf),	\
> +  MACRO (ngle),	\
> +  MACRO (seq),	\
> +  MACRO (ngl),	\
> +  MACRO (lt),	\
> +  MACRO (nge),	\
> +  MACRO (le),	\
> +  MACRO (ngt)
> +
> +/* Enumerates the codes above as LARCH_FP_COND_<X>.  */
> +#define DECLARE_LARCH_COND(X) LARCH_FP_COND_##X
> +enum loongarch_fp_condition
> +{
> +  LARCH_FP_CONDITIONS (DECLARE_LARCH_COND)
> +};
> +#undef DECLARE_LARCH_COND
> +
> +/* Index X provides the string representation of LARCH_FP_COND_<X>.  */
> +#define STRINGIFY(X) #X
> +const char *const
> +loongarch_fp_conditions[16]= {LARCH_FP_CONDITIONS (STRINGIFY)};
> +#undef STRINGIFY

It doesn't look like the code above is needed, since none of the current
built-ins have a condition code attached.

Same applies to the later “cond” field and related comments.

> +
> +/* Declare an availability predicate for built-in functions that require
> + * COND to be true.  NAME is the main part of the predicate's name.  */

Formatting nit: GNU style is not to have the “*” at the start
of the line.

> +#define AVAIL_ALL(NAME, COND) \
> +  static unsigned int \
> +  loongarch_builtin_avail_##NAME (void) \
> +  { \
> +    return (COND) ? 1 : 0; \
> +  }
> +
> +static unsigned int
> +loongarch_builtin_avail_default (void)
> +{
> +  return 1;
> +}
> +/* This structure describes a single built-in function.  */
> +struct loongarch_builtin_description

Very minor nit, sorry, but: missing blank line before the comment.

> […]
> +/* Loongson support crc.  */
> +#define CODE_FOR_loongarch_crc_w_b_w CODE_FOR_crc_w_b_w
> +#define CODE_FOR_loongarch_crc_w_h_w CODE_FOR_crc_w_h_w
> +#define CODE_FOR_loongarch_crc_w_w_w CODE_FOR_crc_w_w_w
> +#define CODE_FOR_loongarch_crc_w_d_w CODE_FOR_crc_w_d_w
> +#define CODE_FOR_loongarch_crcc_w_b_w CODE_FOR_crcc_w_b_w
> +#define CODE_FOR_loongarch_crcc_w_h_w CODE_FOR_crcc_w_h_w
> +#define CODE_FOR_loongarch_crcc_w_w_w CODE_FOR_crcc_w_w_w
> +#define CODE_FOR_loongarch_crcc_w_d_w CODE_FOR_crcc_w_d_w
> +
> +/* Privileged state instruction.  */
> +#define CODE_FOR_loongarch_cpucfg CODE_FOR_cpucfg
> +#define CODE_FOR_loongarch_asrtle_d CODE_FOR_asrtle_d
> +#define CODE_FOR_loongarch_asrtgt_d CODE_FOR_asrtgt_d
> +#define CODE_FOR_loongarch_csrrd CODE_FOR_csrrd
> +#define CODE_FOR_loongarch_dcsrrd CODE_FOR_dcsrrd
> +#define CODE_FOR_loongarch_csrwr CODE_FOR_csrwr
> +#define CODE_FOR_loongarch_dcsrwr CODE_FOR_dcsrwr
> +#define CODE_FOR_loongarch_csrxchg CODE_FOR_csrxchg
> +#define CODE_FOR_loongarch_dcsrxchg CODE_FOR_dcsrxchg
> +#define CODE_FOR_loongarch_iocsrrd_b CODE_FOR_iocsrrd_b
> +#define CODE_FOR_loongarch_iocsrrd_h CODE_FOR_iocsrrd_h
> +#define CODE_FOR_loongarch_iocsrrd_w CODE_FOR_iocsrrd_w
> +#define CODE_FOR_loongarch_iocsrrd_d CODE_FOR_iocsrrd_d
> +#define CODE_FOR_loongarch_iocsrwr_b CODE_FOR_iocsrwr_b
> +#define CODE_FOR_loongarch_iocsrwr_h CODE_FOR_iocsrwr_h
> +#define CODE_FOR_loongarch_iocsrwr_w CODE_FOR_iocsrwr_w
> +#define CODE_FOR_loongarch_iocsrwr_d CODE_FOR_iocsrwr_d
> +#define CODE_FOR_loongarch_lddir CODE_FOR_lddir
> +#define CODE_FOR_loongarch_dlddir CODE_FOR_dlddir
> +#define CODE_FOR_loongarch_ldpte CODE_FOR_ldpte
> +#define CODE_FOR_loongarch_dldpte CODE_FOR_dldpte
> +#define CODE_FOR_loongarch_cacop CODE_FOR_cacop
> +#define CODE_FOR_loongarch_dcacop CODE_FOR_dcacop
> +#define CODE_FOR_loongarch_dbar CODE_FOR_dbar
> +#define CODE_FOR_loongarch_ibar CODE_FOR_ibar

Unless there's a reason not to, it would be better to add “loongarch_”
to the names of the .md patterns instead.  That removes the need for
the list above, but it also reduces the risk of an accidental clash
with target-independent pattern names.
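
I.e., in the .md file:

  (define_insn "loongarch_crc_w_b_w" ...)

instead of:

  (define_insn "crc_w_b_w" ...)

so that CODE_FOR_loongarch_crc_w_b_w exists directly and the whole
#define block above can go.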

Looks good to me otherwise, thanks.

Richard

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 08/12] LoongArch Port: libgcc
  2022-03-04  7:18 ` [PATCH v8 08/12] LoongArch Port: libgcc xuchenghua
@ 2022-03-08 18:59   ` Richard Sandiford
  2022-03-08 19:37     ` Andreas Schwab
  0 siblings, 1 reply; 34+ messages in thread
From: Richard Sandiford @ 2022-03-08 18:59 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph

xuchenghua@loongson.cn writes:
> diff --git a/libgcc/config/loongarch/crti.S b/libgcc/config/loongarch/crti.S
> new file mode 100644
> index 00000000000..27b7eab3626
> --- /dev/null
> +++ b/libgcc/config/loongarch/crti.S
> @@ -0,0 +1,43 @@
> +/* Copyright (C) 2021-2022 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +Under Section 7 of GPL version 3, you are granted additional
> +permissions described in the GCC Runtime Library Exception, version
> +3.1, as published by the Free Software Foundation.
> +
> +You should have received a copy of the GNU General Public License and
> +a copy of the GCC Runtime Library Exception along with this program;
> +see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +<http://www.gnu.org/licenses/>.  */
> +
> +/* 4 slots for argument spill area.  1 for cpreturn, 1 for stack.
> +   Return spill offset of 8.  Aligned to 16 bytes for lp64.  */

The comment doesn't apply to the Loongson code.  Probably best
to delete it.

Same for crtn.S.

> +
> +	.section .init,"ax",@progbits
> +	.globl	_init
> +	.type	_init,@function
> +_init:
> +	addi.d   $r3,$r3,-16
> +	st.d      $r1,$r3,8
> +	addi.d   $r3,$r3,16
> +	jirl	$r0,$r1,0
> +
> +	.section .fini,"ax",@progbits
> +	.globl	_fini
> +	.type	_fini,@function
> +_fini:
> +	addi.d   $r3,$r3,-16
> +	st.d      $r1,$r3,8
> +	addi.d   $r3,$r3,16
> +	jirl	$r0,$r1,0

Are you sure this is right?  It looks like it pushes LR and then
immediately pops it and returns, which would have the effect of
bypassing the rest of the .init and .fini code.

The idea instead is that .init starts with the code in crti.S,
then contains any .init code linked in from .o files, then ends
with the .init code in crtn.S.  Same for .fini.
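
For .init, the usual split looks something like this (a sketch only,
reusing the same stack slots as your version; the same applies to
.fini):

	/* crti.S: prologue only; push LR and fall through into the
	   .init code contributed by other objects.  */
	.section .init,"ax",@progbits
	.globl	_init
	.type	_init,@function
  _init:
	addi.d	$r3,$r3,-16
	st.d	$r1,$r3,8

	/* crtn.S: the matching epilogue; pop LR and return.  */
	.section .init,"ax",@progbits
	ld.d	$r1,$r3,8
	addi.d	$r3,$r3,16
	jirl	$r0,$r1,0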

Looks good to me otherwise.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 11/12] LoongArch Port: gcc/testsuite
  2022-03-04  7:18 ` [PATCH v8 11/12] LoongArch Port: gcc/testsuite xuchenghua
@ 2022-03-08 19:06   ` Richard Sandiford
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Sandiford @ 2022-03-08 19:06 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph

xuchenghua@loongson.cn writes:
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index 737e1a8913b..843b508b010 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -286,6 +286,10 @@ proc check_configured_with { pattern } {
>  proc check_weak_available { } {
>      global target_cpu
>  
> +    if { [ string first "loongarch" $target_cpu ] >= 0 } {
> +        return 1
> +    }
> +
>      # All mips targets should support it
>  
>      if { [ string first "mips" $target_cpu ] >= 0 } {

For modern targets, the procedure ought to give the right answer without
this change.  I'm not sure off-hand which MIPS target required the
special case, but it's probably not one we support any more.

It would be good to have tests in gcc.target/loongarch that cover
all of the intrinsics defined in larchintrin.h.

Looks good to me otherwise, thanks.

Richard

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 08/12] LoongArch Port: libgcc
  2022-03-08 18:59   ` Richard Sandiford
@ 2022-03-08 19:37     ` Andreas Schwab
  0 siblings, 0 replies; 34+ messages in thread
From: Andreas Schwab @ 2022-03-08 19:37 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph, richard.sandiford

On Mar 08 2022, Richard Sandiford via Gcc-patches wrote:

>> +
>> +	.section .init,"ax",@progbits
>> +	.globl	_init
>> +	.type	_init,@function
>> +_init:
>> +	addi.d   $r3,$r3,-16
>> +	st.d      $r1,$r3,8
>> +	addi.d   $r3,$r3,16
>> +	jirl	$r0,$r1,0
>> +
>> +	.section .fini,"ax",@progbits
>> +	.globl	_fini
>> +	.type	_fini,@function
>> +_fini:
>> +	addi.d   $r3,$r3,-16
>> +	st.d      $r1,$r3,8
>> +	addi.d   $r3,$r3,16
>> +	jirl	$r0,$r1,0
>
> Are you sure this is right?  It looks like it pushes LR and then
> immediately pops it and returns, which would have the effect of
> bypassing the rest of the .init and .fini code.
>
> The idea instead is that .init starts with the code in crti.S,
> then contains any .init code linked in from .o files, then ends
> with the .init code in crtn.S.  Same for .fini.

New architectures should not use .init/.fini at all.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 12/12] LoongArch Port: Add doc.
  2022-03-04  7:18 ` [PATCH v8 12/12] LoongArch Port: Add doc xuchenghua
@ 2022-03-08 19:53   ` Richard Sandiford
  0 siblings, 0 replies; 34+ messages in thread
From: Richard Sandiford @ 2022-03-08 19:53 UTC (permalink / raw)
  To: xuchenghua; +Cc: gcc-patches, chenglulu, joseph

xuchenghua@loongson.cn writes:
> From: chenglulu <chenglulu@loongson.cn>
>
> 2022-03-04  Chenghua Xu  <xuchenghua@loongson.cn>
> 	    Lulu Cheng  <chenglulu@loongson.cn>
>
> 	* contrib/config-list.mk: Add LoongArch triplet.
> 	* gcc/doc/install.texi: Add LoongArch options section.
> 	* gcc/doc/invoke.texi: Add LoongArch options section.
> 	* gcc/doc/md.texi: Add LoongArch options section.
> ---
>  contrib/config-list.mk |   5 +-
>  gcc/doc/install.texi   |  47 +++++++++-
>  gcc/doc/invoke.texi    | 202 +++++++++++++++++++++++++++++++++++++++++
>  gcc/doc/md.texi        |  55 +++++++++++
>  4 files changed, 303 insertions(+), 6 deletions(-)
>
> diff --git a/contrib/config-list.mk b/contrib/config-list.mk
> index 3e1d1321861..ba6f12e4693 100644
> --- a/contrib/config-list.mk
> +++ b/contrib/config-list.mk
> @@ -57,7 +57,10 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
>    i686-wrs-vxworksae \
>    i686-cygwinOPT-enable-threads=yes i686-mingw32crt ia64-elf \
>    ia64-freebsd6 ia64-linux ia64-hpux ia64-hp-vms iq2000-elf lm32-elf \
> -  lm32-rtems lm32-uclinux m32c-rtems m32c-elf m32r-elf m32rle-elf \
> +  lm32-rtems lm32-uclinux \
> +  loongarch64-linux-gnu loongarch64-linux-gnuf64 \
> +  loongarch64-linux-gnuf32 loongarch64-linux-gnusf \

If I've understood correctly, loongarch64-linux-gnu defaults to
the same ABI as loongarch64-linux-gnuf64, is that right?  If so,
it's probably worth dropping one of them from this list to reduce
duplication.  In other words, it feels like there should just be
3 entries here rather than 4.

> […]
> @@ -1254,6 +1255,14 @@ profile.  The union of these options is considered when specifying both
>  @code{-mfloat-abi=hard}
>  @end multitable
>  
> +@item loongarch*-*-*
> +@var{list} is a comma-separated list of the following ABI identifiers:
> +@code{lp64d[/base]} @code{lp64f[/base]} @code{lp64d[/base]}, where the
> +@code{/base} suffix may be omitted, to enable their respective run-time
> +libraries.  If @var{list} is empty, @code{default}
> +or @option{--with-multilib-list} is not specified, then the default ABI

Maybe clearer as:

  If @var{list} is empty or @code{default}, or if
  @option{--with-multilib-list} is not specified, […]

> +as specified by @option{--with-abi} or implied by @option{--target} is selected.
> +
>  @item riscv*-*-*
>  @var{list} is a single ABI name.  The target architecture must be either
>  @code{rv32gc} or @code{rv64gc}.  This will build a single multilib for the
> […]
> @@ -995,6 +995,16 @@ Objective-C and Objective-C++ Dialects}.
>  @gccoptlist{-mbarrel-shift-enabled  -mdivide-enabled  -mmultiply-enabled @gol
>  -msign-extend-enabled  -muser-enabled}
>  
> +@emph{LoongArch Options}
> +@gccoptlist{-march=@var{cpu-type}  -mtune=@var{cpu-type} -mabi=@var{base-abi-type} @gol
> +-mfpu=@var{fpu-type} -msoft-float -msingle-float -mdouble-float @gol
> +-mbranch-cost=@var{n}  -mcheck-zero-division -mno-check-zero-division @gol
> +-mcond-move-int  -mno-cond-move-int @gol
> +-mcond-move-float  -mno-cond-move-float @gol
> +-memcpy  -mno-memcpy -mstrict-align -mno-strict-align @gol
> +-mmax-inline-memcpy-size=@var{n} @gol
> +-mlra -mcmodel=@var{code-model}}

Following on from earlier comments, please remove -mlra :-)
(Or more specifically, -mno-lra.)

> +
>  @emph{M32R/D Options}
>  @gccoptlist{-m32r2  -m32rx  -m32r @gol
>  -mdebug @gol
> @@ -18863,6 +18873,7 @@ platform.
>  * HPPA Options::
>  * IA-64 Options::
>  * LM32 Options::
> +* LoongArch Options::
>  * M32C Options::
>  * M32R/D Options::
>  * M680x0 Options::
> @@ -24378,6 +24389,197 @@ Enable user-defined instructions.
>  
>  @end table
>  
> +@node LoongArch Options
> +@subsection LoongArch Options
> +@cindex LoongArch Options
> +
> +These command-line options are defined for LoongArch targets:
> +
> +@table @gcctabopt
> +@item -march=@var{cpu-type}
> +@opindex -march
> +Generate instructions for the machine type @var{cpu-type}.  In contrast to
> +@option{-mtune=@var{cpu-type}}, which merely tunes the generated code
> +for the specified @var{cpu-type}, @option{-march=@var{cpu-type}} allows GCC
> +to generate code that may not run at all on processors other than the one
> +indicated.  Specifying @option{-march=@var{cpu-type}} implies
> +@option{-mtune=@var{cpu-type}}, except where noted otherwise.
> +
> +The choices for @var{cpu-type} are:
> +
> +@table @samp
> +@item native
> +This selects the CPU to generate code for at compilation time by determining
> +the processor type of the compiling machine.  Using @option{-march=native}
> +enables all instruction subsets supported by the local machine (hence
> +the result might not run on different machines).  Using @option{-mtune=native}
> +produces code optimized for the local machine under the constraints
> +of the selected instruction set.
> +@item loongarch64
> +A generic CPU with 64-bit extensions.
> +@item la464
> +LoongArch LA464 CPU with LBT, LSX, LASX, LVZ.
> +@end table
> +
> +
> +@item -mtune=@var{cpu-type}
> +@opindex mtune
> +Optimize the output for the given processor, specified by microarchitecture
> +name.
> +
> +@item -mabi=@var{base-abi-type}
> +@opindex mabi
> +Generate code for the specified calling convention. @gol
> +Set base ABI to one of: @gol

Using @gol looks odd here.  Maybe:

s/Set base ABI to one of/@var{base-abi-type} can be one of:/

> +@table @samp
> +@item lp64d
> +Uses 64-bit general purpose registers and 32/64-bit floating-point
> +registers for parameter passing.  Data model is LP64, where int
> +is 32 bits, while long int and pointers are 64 bits.

@samp{int} and @samp{long int}.  Same for the others.

> +@item lp64f
> +Uses 64-bit general purpose registers and 32-bit floating-point
> +registers for parameter passing.  Data model is LP64, where int
> +is 32 bits, while long int and pointers are 64 bits.
> +@item lp64s
> +Uses 64-bit general purpose registers and no floating-point
> +registers for parameter passing.  Data model is LP64, where int
> +is 32 bits, while long int and pointers are 64 bits.
> +@end table
> +
> +

Very minor, sorry, but it's probably more consistent to have only
a single blank line, even though I realise you're using double
blank lines to separate options into related groups.  Same for
the rest of this section.

> +@item -mfpu=@var{fpu-type}
> +@opindex mfpu
> +Generating code for the specified FPU type: @gol

Maybe:

  Generate code for the specified FPU type, which can be one of:

(likewise without @gol)

> +@table @samp
> +@item 64
> +Allow the use of hardware floating-point instructions for 32-bit
> +and 64-bit operations.
> +@item 32
> +Allow the use of hardware floating-point instructions for 32-bit
> +operations.
> +@item none
> +@item 0

Going back to the review for part 2: is 0 really supported?

> +Prevent the use of hardware floating-point instructions.
> +@end table
> +
> +
> +@item -msoft-float
> +@opindex msoft-float
> +Force @option{-mfpu=none} and prevents the use of floating-point
> +registers for parameter passing.  This option may change the target
> +ABI.
> +
> +@item -msingle-float
> +@opindex -msingle-float
> +Force @option{-mfpu=32} and allow the use of 32-bit floating-point
> +registers for parameter passing.  This option may change the target
> +ABI.
> +
> +@item -mdouble-float
> +@opindex -mdouble-float
> +Force @option{-mfpu=64} and allow the use of 32/64-bit floating-point
> +registers for parameter passing.  This option may change the target
> +ABI.
> +
> +
> +@item -mbranch-cost=@var{n}
> +@opindex -mbranch-cost
> +Set the cost of branches to roughly n instructions.

Markup nit: @var{n}

> +
> +@item -mcheck-zero-division
> +@itemx -mno-check-zero-divison
> +@opindex -mcheck-zero-division
> +Trap (do not trap) on integer division by zero. The default is '-mcheck-zero-
> +division'.

Markup nit: @option{-mcheck-zero-division}

Same for later options.  The option names shouldn't be split across lines.

> +
> +
> +@item -mcond-move-int
> +@itemx -mno-cond-move-int
> +@opindex -mcond-move-int
> +Conditional moves for integer are enabled (disabled). The default is
> +'-mcond-move-float'.

Looks like there's something missing here: there should be a listing
for both the int and the float options.  “integer registers” (and
“floating-point registers”) might be clearer than just “integer”.

> +
> +@item -mmemcpy
> +@itemx -mno-memcpy
> +@opindex -mmemcpy
> +Force (do not force) the use of memcpy for non-trivial block moves. The default
> +is '-mno-memcpy', which allows GCC to inline most constant-sized copies.

Following on from the earlier review: this should mention that the
option is affected by -Os, if that is the intention.

> +
> +
> +@item -mlra
> +@opindex -mlra
> +Use the new LRA register allocator. By default, the LRA is used.
> +
> +@item -mstrict-align
> +@itemx -mno-strict-align
> +@opindex -mstrict-align
> +Avoid or allow generating memory accesses that may not be aligned on a natural
> +object boundary as described in the architecture specification. The default is
> +'-mno-strict-align'.
> +
> +@item -msmall-data-limit=@var{number}
> +@opindex -msmall-data-limit
> +Put global and static data smaller than @code{number} bytes into a special section (on some targets).

@var{number}

Also (very minor, sorry): the line is longer than the 80 character limit.
Same for some later lines (but not many).

> +Default value is 0.
> +
> +
> +@item -mmax-inline-memcpy-size=@var{n}
> +@opindex -mmax-inline-memcpy-size
> +Set the max size n of memcpy to inline, default n is 1024.

Borrowing from PowerPC, maybe:

  Inline all block moves (such as calls to @code{memcpy} or structure
  copies) less than or equal to @var{n} bytes.  The default value of
  @var{n} is 1024.

This makes it clearer that structure copies are included too.

> +
> +@item -mcmodel=@var{code-model}
> +Default code model is normal.

I think it's clearer to put the default after the table instead,
after “normal” has been introduced.  Markup-wise it should be:

  The default code model is @code{normal}.

> […]
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index f3619c505c0..98c2bd67a80 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -2747,6 +2747,61 @@ Memory addressed using the small base register ($sb).
>  $r1h
>  @end table
>  
> +@item LoongArch---@file{config/loongarch/constraints.md}
> +@table @code
> +@item a
> +A constant call global and noplt address.
> +@item c
> +A constant call local address.
> +@item e
> +A register that is used as function call.
> +@item f
> +A floating-point register (if available).
> +@item h
> +A constant call plt address.
> +@item j
> +A rester that is used as sibing call.

Typo: register

> +@item k
> +A memory operand whose address is formed by a base register and
> +(optionally scaled) index register.
> +@item l
> +A signed 16-bit constant.
> +@item m
> +A memory operand whose address is formed by a base register and offset
> +that is suitable for use in instructions with the same addressing mode
> +as @code{st.w} and @code{ld.w}.
> +@item q
> +A general-purpose register except for $r0 and $r1 for csr instructions.
> +@item t
> +A constant call weak address.
> +@item u
> +A signed 52bit constant and low 32-bit is zero (for logic instructions).
> +@item v
> +A nsigned 64-bit constant and low 44-bit is zero (for logic instructions).
> +@item w
> +Matches any valid memory.
> +@item z
> +A floating-point condition code register.
> +@item G
> +Floating-point zero.
> +@item I
> +A signed 12-bit constant (for arithmetic instructions).
> +@item J
> +Integer zero.
> +@item K
> +An unsigned 12-bit constant (for logic instructions).
> +@item Yd
> +A constant @code{move_operand} that can be safely loaded using
> +@code{la}.
> +@item ZB
> +An address that is held in a general-purpose register.
> +The offset is zero.
> +@item ZC
> +A memory operand whose address is formed by a base register and offset
> +that is suitable for use in instructions with the same addressing mode
> +as @code{ll.w} and @code{sc.w}.
> +@end table
> +

Are you sure you want to make all of these public?  It doesn't look like
they would all be useful to inline assembly programmers, and once a
constraint is public, it can't easily be changed or removed without
breaking backwards compatibility.

“What linux needs” is probably a good starting point :-)

Just a question/suggestion though, no need to change.

Looks good to me otherwise.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 00/12] Add LoongArch support.
  2022-03-04 20:02 ` [PATCH v8 00/12] Add LoongArch support Xi Ruoyao
  2022-03-05  1:36   ` Paul Hua
@ 2022-03-08 19:57   ` Richard Sandiford
  1 sibling, 0 replies; 34+ messages in thread
From: Richard Sandiford @ 2022-03-08 19:57 UTC (permalink / raw)
  To: Xi Ruoyao via Gcc-patches; +Cc: xuchenghua, Xi Ruoyao, chenglulu, joseph

Xi Ruoyao via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Fri, 2022-03-04 at 15:17 +0800, xuchenghua@loongson.cn wrote:
>
>> The binutils has been merged into trunk:
>> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=560b3fe208255ae909b4b1c88ba9c28b09043307
>> 
>> Note: We split -mabi= into -mabi=lp64d/f/s; the new options are not supported by upstream binutils yet,
>> so this GCC port requires the following patch applied to binutils to build.
>> https://github.com/loongson/binutils-gdb/commit/aacb0bf860f02aa5a7dcb76dd0e392bf871c7586
>> (will be submitted upstream after the GCC side is confirmed)
>
> I think you don't need a review for the binutils change here.  You should
> get it reviewed and applied in binutils-gdb ASAP.  Then in install.texi
> you would add a note like "loongarch64-*-* requires binutils >= 2.39" in
> "Target specific installation notes", as an unpatched 2.38 does not
> work.
>
> And based on the history of the RISC-V port
> (https://gcc.gnu.org/pipermail/gcc/2017-January/222595.html), the
> process for a new port seems to be:
>
> 1. Get a permission from the Steering Committee.
> 2. Add one or two port maintainers into MAINTAINERS file.
> 3. Now the technical reviewing of the patch series just begin.
>
>
> I'm not an expert in software engineering (or social interaction :) and
> I don't know if the process has changed in recent years.

I'm not sure either, but yeah, this is what I understood the process to be.

On the technical side: I've gone through the series and sent comments
about some of the patches.  The ones I didn't reply to looked good as-is.

Generally the series looks in very good shape to me FWIW.  There are no
target-independent changes, so I agree there's no reason to delay the
patches until GCC 13.  If for some reason they don't go in before
GCC 12.1, they would be safe to backport to GCC 12.2.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 04/12] LoongArch Port: Machine description files.
  2022-03-07 20:15   ` Xi Ruoyao
@ 2022-03-10  6:26     ` 程璐璐
  0 siblings, 0 replies; 34+ messages in thread
From: 程璐璐 @ 2022-03-10  6:26 UTC (permalink / raw)
  To: Xi Ruoyao, xuchenghua, gcc-patches; +Cc: joseph

Hi,

We are modifying the code; this support will be added in the next
commit.

Thanks.

On 2022/3/8 at 4:15 AM, Xi Ruoyao wrote:
> On Fri, 2022-03-04 at 15:18 +0800, xuchenghua@loongson.cn wrote:
>
>>          * config/loongarch/loongarch.md: New file.
> An ICE happens building OpenSSH-8.9p1.  Investigation shows it's caused
> by the flag "-fzero-call-used-regs=".  It's because the compiler
> attempts to clear FCCx registers but can't figure out how.
>
> This flag also triggers ICE for other targets (for example, PR 104820
> for MIPS), and the related tests (zero-scratch-regs-{8,9,10,11}.c) are
> marked dg-skip for many targets.
>
> But it's unfortunate that packages like OpenSSH have already started to
> use this flag... I guess they just enabled it once they saw it was
> working for i386 :(.  So it's better to solve the problem for a new
> target.
>
> A "quick fix" is adding an insn to clear FCCx.  This is enough to build
> OpenSSH and make zero-scratch-regs-{8,9,10,11}.c PASS.
>
> diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
> index a9a8bc4b038..76c5ded9fe4 100644
> --- a/gcc/config/loongarch/loongarch.md
> +++ b/gcc/config/loongarch/loongarch.md
> @@ -2020,6 +2020,12 @@
>     DONE;
>   })
>   
> +;; Clear one FCC register
> +(define_insn "movfcc" [(set (match_operand:FCC 0 "register_operand" "=z")
> +                     (const_int 0))]
> +  ""
> +  "movgr2cf\t%0,$r0")
> +
>   ;; Conditional move instructions.
>   
>   (define_insn "*sel<code><GPR:mode>_using_<GPR2:mode>"
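
For reference, the ICE should be reproducible with a sketch as small
as the following (assumption: -fzero-call-used-regs=all makes the
epilogue clear every call-used register, FCCx included, even in an
empty function):

$ cat t.c
void f (void) {}
$ ./gcc/cc1 t.c -nostdinc -fzero-call-used-regs=all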

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 00/12] Add LoongArch support.
  2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
                   ` (12 preceding siblings ...)
  2022-03-04 20:02 ` [PATCH v8 00/12] Add LoongArch support Xi Ruoyao
@ 2022-03-15 14:35 ` Xi Ruoyao
  2022-03-16 12:31   ` 程璐璐
  13 siblings, 1 reply; 34+ messages in thread
From: Xi Ruoyao @ 2022-03-15 14:35 UTC (permalink / raw)
  To: xuchenghua, gcc-patches; +Cc: chenglulu, joseph, Huacai Chen

On Fri, 2022-03-04 at 15:17 +0800, xuchenghua@loongson.cn wrote:

> v7 -> v8
> 1. Add new addressing type ADDRESS_REG_REG support.
> 2. Modify documentation.
> 3. Eliminate compile-time warnings.

Hi,

The v8 series does not build the LoongArch Linux kernel tree
(https://github.com/loongson/linux, loongarch-next branch) successfully.
This is a regression: the v7 series built the kernel fine.

A testcase reduced from the __get_data_asm macro in uaccess.h:

$ cat t1.c
char *ptr;
int offset;
struct m
{
  char a[2];
};

char
x (void)
{
  char t;
  asm volatile("ld.b %0, %1" : "=r"(t) : "o"(*(struct m *)(ptr + offset)));
  return t;
}

$ ./gcc/cc1 t1.c -nostdinc -O
t1.c: In function ‘x’:
t1.c:12:3: error: impossible constraint in ‘asm’
   12 |   asm volatile("ld.b %0, %1" : "=r"(t) : "o"(*(struct m *)(ptr + offset)));
      |   ^~~

It seems changing the constraint "o" to "m" can work around this issue.
I'm not sure if this is a compiler bug or a kernel bug.
-- 
Xi Ruoyao <xry111@mengyan1223.wang>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 00/12] Add LoongArch support.
  2022-03-15 14:35 ` Xi Ruoyao
@ 2022-03-16 12:31   ` 程璐璐
  0 siblings, 0 replies; 34+ messages in thread
From: 程璐璐 @ 2022-03-16 12:31 UTC (permalink / raw)
  To: Xi Ruoyao, xuchenghua, gcc-patches; +Cc: joseph, Huacai Chen


On 2022/3/15 at 10:35 PM, Xi Ruoyao wrote:
> On Fri, 2022-03-04 at 15:17 +0800, xuchenghua@loongson.cn wrote:
>
>> v7 -> v8
>> 1. Add new addressing type ADDRESS_REG_REG support.
>> 2. Modify documentation.
>> 3. Eliminate compile-time warnings.
> Hi,
>
> The v8 series does not build LoongArch Linux kernel tree
> (https://github.com/loongson/linux, loongarch-next branch) successfully.
> This is a regression: the v7 series built the kernel fine.
>
> A testcase reduced from the __get_data_asm macro in uaccess.h:
>
> $ cat t1.c
> char *ptr;
> int offset;
> struct m
> {
>    char a[2];
> };
>
> char
> x (void)
> {
>    char t;
>    asm volatile("ld.b %0, %1" : "=r"(t) : "o"(*(struct m *)(ptr + offset)));
>    return t;
> }
>
> $ ./gcc/cc1 t1.c -nostdinc -O
> t1.c: In function ‘x’:
> t1.c:12:3: error: impossible constraint in ‘asm’
>     12 |   asm volatile("ld.b %0, %1" : "=r"(t) : "o"(*(struct m *)(ptr + offset)));
>        |   ^~~
>
> It seems changing the constraint "o" to "m" can work around this issue.
> I'm not sure if this is a compiler bug or a kernel bug.

Hi,

LoongArch supports the following memory addressing modes and their
constraints:

  1. base + index                                "k"
  2. base + imm12                                "m"
  3. base + imm16 (immediate 4-byte alignment)   "ZC"
  4. base + 0                                    "ZB"

So this constraint should be "m".
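
For reference, the working form of the reduced test only changes the
constraint letter (a sketch; ptr, offset and struct m as in the
original t1.c above):

char
x (void)
{
  char t;
  /* "m" (base + imm12) instead of "o".  */
  asm volatile("ld.b %0, %1" : "=r"(t) : "m"(*(struct m *)(ptr + offset)));
  return t;
}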




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 05/12] LoongArch Port: Machine description C files and .h files.
  2022-03-07 18:17   ` Richard Sandiford
@ 2022-03-19  9:35     ` 程璐璐
  0 siblings, 0 replies; 34+ messages in thread
From: 程璐璐 @ 2022-03-19  9:35 UTC (permalink / raw)
  To: xuchenghua, gcc-patches, joseph, richard.sandiford


On 2022/3/8 at 2:17 AM, Richard Sandiford wrote:
>> […]
>> +/* This definition replaces the formerly used 'm' constraint with a
>> +   different constraint letter in order to avoid changing semantics of
>> +   the 'm' constraint when accepting new address formats in
>> +   TARGET_LEGITIMATE_ADDRESS_P.  The constraint letter defined here
>> +   must not be used in insn definitions or inline assemblies.  */
>> +#define TARGET_MEM_CONSTRAINT 'w'
> Do you need to do this for a new port like Loongson?  It looks like
> TARGET_LEGITIMATE_ADDRESS_P accepts all valid forms of address,
> including reg+reg (good!), so shouldn't "m" accept them too?
>
> If there is already existing code that assumes that "m" is never indexed
> then the definition obviously makes sense.  But if you don't know of any
> such code then it would be better to make "m" accept all the things that
> TARGET_LEGITIMATE_ADDRESS_P does (by removing this definition).
>
Hi, Richard:

LoongArch supports the following memory addressing modes:

  1. base + index
  2. base + imm12
  3. base + imm16 (immediate 4-byte alignment)
  4. base + 0

Because 'base + imm12' is the most common case, we defined it as 'm'.
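
The user-visible effect of keeping "m" narrow is that an operand whose
address is base + index has its address reloaded into a single base
register first (a sketch of the assumed behaviour):

char
g (char *p, long i)
{
  char t;
  /* p + i is computed into one register; the access then matches "m"
     as base + imm12, with a zero offset.  */
  asm ("ld.b %0, %1" : "=r"(t) : "m"(p[i]));
  return t;
}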


Thanks.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v8 04/12] LoongArch Port: Machine description files.
  2022-03-06 16:16   ` Richard Sandiford
  2022-03-07  3:59     ` 程璐璐
@ 2022-03-19  9:40     ` 程璐璐
  1 sibling, 0 replies; 34+ messages in thread
From: 程璐璐 @ 2022-03-19  9:40 UTC (permalink / raw)
  To: xuchenghua, gcc-patches, joseph, richard.sandiford


On 2022/3/7 at 12:16 AM, Richard Sandiford wrote:
>> +(define_split
>> +  [(match_operand 0 "small_data_pattern")]
>> +  "reload_completed"
>> +  [(match_dup 0)]
>> +  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
>> +
>> +
>> +;; Match paired HI/SI/SF/DFmode load/stores.
>> +(define_insn "*join2_load_store<JOIN_MODE:mode>"
>> +  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
>> +  "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 1 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))
>> +   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
>> +   "=r,f,m,m,r,ZC,r,k,f,k")
>> +	(match_operand:JOIN_MODE 3 "nonimmediate_operand" "m,m,r,f,ZC,r,k,r,k,f"))]
>> +  "reload_completed"
>> +  {
>> +    bool load_p = (which_alternative == 0 || which_alternative == 1);
>> +    /* Reg-renaming pass reuses base register if it is dead after bonded loads.
>> +       Hardware does not bond those loads, even when they are consecutive.
>> +       However, order of the loads need to be checked for correctness.  */
>> +    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
>> +      {
> I'm not sure I understand how these patterns work, but it looks like the
> condition above is trying to work around a later change to the insn by
> regrename, after peephole2 has checked loongarch_load_store_bonding_p.
> If so, you should be able to avoid that by marking the destinations of
> the loads as earlyclobbers, using "&r" instead of "r" for the first
> alternative.  regrename should then preserve the conditions that
> loongarch_load_store_bonding_p checked earlier.
>
> Same for the other patterns.
>
Hi,

I think the peephole pass runs after the reload pass, so the peephole
patterns don't need '&'.
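
A sketch of the kind of sequence the peephole targets (whether the two
loads are actually bonded depends on loongarch_load_store_bonding_p;
the example is an assumption, not from the testsuite):

int
sum2 (int *p)
{
  /* Two consecutive SImode loads at adjacent offsets from the same
     base register -- candidates for *join2_load_store.  */
  return p[0] + p[1];
}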


Thanks.




^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2022-03-19  9:40 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-04  7:17 [PATCH v8 00/12] Add LoongArch support xuchenghua
2022-03-04  7:17 ` [PATCH v8 01/12] LoongArch Port: Regenerate configure xuchenghua
2022-03-04  7:17 ` [PATCH v8 02/12] LoongArch Port: gcc build xuchenghua
2022-03-06 10:29   ` Richard Sandiford
2022-03-06 10:34     ` Andreas Schwab
2022-03-04  7:18 ` [PATCH v8 03/12] LoongArch Port: Regenerate gcc/configure xuchenghua
2022-03-04  7:18 ` [PATCH v8 04/12] LoongArch Port: Machine description files xuchenghua
2022-03-06 16:16   ` Richard Sandiford
2022-03-07  3:59     ` 程璐璐
2022-03-19  9:40     ` 程璐璐
2022-03-07 20:15   ` Xi Ruoyao
2022-03-10  6:26     ` 程璐璐
2022-03-04  7:18 ` [PATCH v8 05/12] LoongArch Port: Machine description C files and .h files xuchenghua
2022-03-07 18:17   ` Richard Sandiford
2022-03-19  9:35     ` 程璐璐
2022-03-04  7:18 ` [PATCH v8 06/12] LoongArch Port: Builtin functions xuchenghua
2022-03-08 18:43   ` Richard Sandiford
2022-03-04  7:18 ` [PATCH v8 07/12] LoongArch Port: Builtin macros xuchenghua
2022-03-04  7:18 ` [PATCH v8 08/12] LoongArch Port: libgcc xuchenghua
2022-03-08 18:59   ` Richard Sandiford
2022-03-08 19:37     ` Andreas Schwab
2022-03-04  7:18 ` [PATCH v8 09/12] LoongArch Port: Regenerate libgcc/configure xuchenghua
2022-03-04  7:18 ` [PATCH v8 10/12] LoongArch Port: libgomp xuchenghua
2022-03-04  7:18 ` [PATCH v8 11/12] LoongArch Port: gcc/testsuite xuchenghua
2022-03-08 19:06   ` Richard Sandiford
2022-03-04  7:18 ` [PATCH v8 12/12] LoongArch Port: Add doc xuchenghua
2022-03-08 19:53   ` Richard Sandiford
2022-03-04 20:02 ` [PATCH v8 00/12] Add LoongArch support Xi Ruoyao
2022-03-05  1:36   ` Paul Hua
2022-03-05  1:39     ` Paul Hua
2022-03-05  9:21     ` Xi Ruoyao
2022-03-08 19:57   ` Richard Sandiford
2022-03-15 14:35 ` Xi Ruoyao
2022-03-16 12:31   ` 程璐璐

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).