public inbox for gcc-patches@gcc.gnu.org
* [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion
From: acsawdey @ 2020-12-04 19:19 UTC
  To: gcc-patches; +Cc: segher, wschmidt, Aaron Sawdey

From: Aaron Sawdey <acsawdey@linux.ibm.com>

This patch adds the first batch of patterns to support p10 fusion. These
will allow combine to create a single insn for a pair of instructions
that that power10 can fuse and execute. These particular ones have the
requirement that only cr0 can be used when fusing a load with a compare
immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
to put that requirement in, and if it doesn't work out later the splitter
can get used.
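
As a rough illustration of the requirement described above, here is a
minimal sketch in plain C. It is not GCC code: the function names and
the idea of passing the CR field as an integer are assumptions made
only for this example. It models when a load + compare-immediate pair
can stay fused and when the splitter would have to separate it.

```c
#include <stdbool.h>

/* Hypothetical model, not part of the patch.  The immediate must be
   -1, 0 or 1 for a signed compare, and 0 or 1 for an unsigned one.  */
static bool
imm_ok_for_fusion (bool is_signed, long imm)
{
  if (is_signed)
    return imm >= -1 && imm <= 1;
  return imm >= 0 && imm <= 1;
}

/* The pair stays fused only if the compare result lands in cr0 and
   the immediate is in range; otherwise it is split into a separate
   load and compare.  */
static bool
stays_fused (int cr_field, bool is_signed, long imm)
{
  return cr_field == 0 && imm_ok_for_fusion (is_signed, imm);
}
```

For instance, stays_fused (0, true, -1) holds, while
stays_fused (1, true, 0) does not because the result is not in cr0.
The real split condition in the patterns also checks the address form
of the load, which this sketch leaves out.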

The patterns are generated by a script genfusion.pl and live in new file
fusion.md. This script will be expanded to generate more patterns for
fusion.
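
To give a feel for what the script enumerates, the following sketch
mirrors its mnemonic selection in plain C. The helper names are
hypothetical and exist only for this illustration; the mapping itself
follows the generated patterns (ld, lwa/lwz, lha/lhz, lbz paired with
cmpdi or cmpldi).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pick the load mnemonic for a load mode and
   compare signedness.  Returns NULL where no pattern is generated,
   e.g. a signed compare of a QI load.  */
static const char *
load_mnemonic (const char *lmode, int is_signed)
{
  if (strcmp (lmode, "DI") == 0)
    return "ld";
  if (strcmp (lmode, "SI") == 0)
    return is_signed ? "lwa" : "lwz";
  if (strcmp (lmode, "HI") == 0)
    return is_signed ? "lha" : "lhz";
  if (strcmp (lmode, "QI") == 0)
    return is_signed ? NULL : "lbz";
  return NULL;
}

/* Signed compares (CC) use cmpdi, unsigned ones (CCUNS) use cmpldi.  */
static const char *
compare_mnemonic (int is_signed)
{
  return is_signed ? "cmpdi" : "cmpldi";
}
```

So an SI load with a signed compare pairs lwa with cmpdi, and a QI
load is only ever paired with cmpldi.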

This also adds option -mpower10-fusion which defaults on for power10 and
will gate all these fusion patterns. In addition I have added an
undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
that just controls the load+compare-immediate patterns. I have made
these default on for power10 but they are not disallowed for earlier
processors because it is still valid code. This allows us to test the
correctness of fusion code generation by turning it on explicitly.
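
The intended defaulting can be sketched as follows. This is a
simplified stand-alone model, not the rs6000.c code: the mask values
and function names are made up for the example, and only the
"default on for power10 unless set explicitly" behaviour is modelled.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch.  */
enum
{
  MASK_POWER10 = 1u << 0,
  MASK_P10_FUSION = 1u << 1,
  MASK_P10_FUSION_LD_CMPI = 1u << 2
};

/* FLAGS holds the current option bits; EXPLICIT_FLAGS marks the bits
   the user set or cleared on the command line.  On power10, each
   fusion option is turned on unless the user chose a value for it.  */
static unsigned
apply_fusion_defaults (unsigned flags, unsigned explicit_flags)
{
  if (flags & MASK_POWER10)
    {
      if (!(explicit_flags & MASK_P10_FUSION))
        flags |= MASK_P10_FUSION;
      if (!(explicit_flags & MASK_P10_FUSION_LD_CMPI))
        flags |= MASK_P10_FUSION_LD_CMPI;
    }
  return flags;
}

/* The load-cmpi patterns need both options enabled.  */
static bool
ld_cmpi_patterns_enabled (unsigned flags)
{
  return (flags & MASK_P10_FUSION) && (flags & MASK_P10_FUSION_LD_CMPI);
}
```

Under this model an earlier processor with both options given
explicitly still enables the patterns, which is what makes testing
without power10 as the target possible.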

If bootstrap/regtest is clean, ok for trunk?

Thanks!

   Aaron

gcc/ChangeLog:

	* config/rs6000/genfusion.pl: New file, script to generate
	define_insn_and_split patterns so combine can arrange fused
	instructions next to each other.
	* config/rs6000/fusion.md: New file, generated fused instruction
	patterns for combine.
	* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
	(non_update_memory_operand): New predicate.
	* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
	OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
	POWERPC_MASKS.
	* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
	prototype.
	* config/rs6000/rs6000.c (rs6000_option_override_internal):
	Automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
	if target is power10.  (rs6000_opt_masks): Allow -mpower10-fusion
	in function attributes.  (address_is_non_pfx_d_or_x): New function.
	* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
	* config/rs6000/rs6000.md: Include fusion.md.
	* config/rs6000/rs6000.opt: Add -mpower10-fusion
	and -mpower10-fusion-ld-cmpi.
	* config/rs6000/t-rs6000: Add dependencies involving fusion.md.
---
 gcc/config/rs6000/fusion.md       | 357 ++++++++++++++++++++++++++++++
 gcc/config/rs6000/genfusion.pl    | 144 ++++++++++++
 gcc/config/rs6000/predicates.md   |  14 ++
 gcc/config/rs6000/rs6000-cpus.def |   6 +-
 gcc/config/rs6000/rs6000-protos.h |   2 +
 gcc/config/rs6000/rs6000.c        |  51 +++++
 gcc/config/rs6000/rs6000.h        |   1 +
 gcc/config/rs6000/rs6000.md       |   1 +
 gcc/config/rs6000/rs6000.opt      |   8 +
 gcc/config/rs6000/t-rs6000        |   6 +-
 10 files changed, 588 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/rs6000/fusion.md
 create mode 100755 gcc/config/rs6000/genfusion.pl

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
new file mode 100644
index 00000000000..a4d3a6ae7f3
--- /dev/null
+++ b/gcc/config/rs6000/fusion.md
@@ -0,0 +1,357 @@
+;; -*- buffer-read-only: t -*-
+;; Generated automatically by genfusion.pl
+
+;; Copyright (C) 2020 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it under
+;; the terms of the GNU General Public License as published by the Free
+;; Software Foundation; either version 3, or (at your option) any later
+;; version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+;; for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is DI result mode is clobber compare mode is CC extend is none
+(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
+                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
+   (clobber (match_scratch:DI 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
+(define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
+                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
+   (clobber (match_scratch:DI 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is DI result mode is DI compare mode is CC extend is none
+(define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
+                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
+   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is DI result mode is DI compare mode is CCUNS extend is none
+(define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
+                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
+   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is SI result mode is clobber compare mode is CC extend is none
+(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
+                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
+   (clobber (match_scratch:SI 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is SI result mode is clobber compare mode is CCUNS extend is none
+(define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
+                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
+   (clobber (match_scratch:SI 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is SI result mode is SI compare mode is CC extend is none
+(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
+                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
+   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is SI result mode is SI compare mode is CCUNS extend is none
+(define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
+                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
+   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
+(define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
+                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
+   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
+  [(set (match_dup 0) (sign_extend:EXTSI (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is SI result mode is EXTSI compare mode is CCUNS extend is zero
+(define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
+                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
+   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (zero_extend:EXTSI (match_dup 1)))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (zero_extend:EXTSI (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is HI result mode is clobber compare mode is CC extend is sign
+(define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
+                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
+   (clobber (match_scratch:GPR 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (sign_extend:GPR (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is HI result mode is clobber compare mode is CCUNS extend is zero
+(define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
+                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
+   (clobber (match_scratch:GPR 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is HI result mode is EXTHI compare mode is CC extend is sign
+(define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
+  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
+        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
+                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
+   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (sign_extend:EXTHI (match_dup 1)))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (sign_extend:EXTHI (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CC (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is HI result mode is EXTHI compare mode is CCUNS extend is zero
+(define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
+                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
+   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (zero_extend:EXTHI (match_dup 1)))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (zero_extend:EXTHI (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is QI result mode is clobber compare mode is CCUNS extend is zero
+(define_insn_and_split "*lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
+                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
+   (clobber (match_scratch:GPR 0 "=r"))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
+;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
+;; load mode is QI result mode is GPR compare mode is CCUNS extend is zero
+(define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
+  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
+        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
+                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
+   (set (match_operand:GPR 0 "gpc_reg_operand" "=r") (zero_extend:GPR (match_dup 1)))]
+  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
+  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
+  "&& reload_completed
+   && (cc_reg_not_cr0_operand (operands[2], CCmode)
+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
+  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
+   (set (match_dup 2)
+        (compare:CCUNS (match_dup 0)
+		    (match_dup 3)))]
+  ""
+  [(set_attr "type" "load")
+   (set_attr "cost" "8")
+   (set_attr "length" "8")])
+
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
new file mode 100755
index 00000000000..494537c9439
--- /dev/null
+++ b/gcc/config/rs6000/genfusion.pl
@@ -0,0 +1,144 @@
+#!/usr/bin/perl -w
+# Generate fusion.md 
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+my $copyright =  <<'EOF';
+;; -*- buffer-read-only: t -*-
+;; Generated automatically by genfusion.pl
+
+;; Copyright (C) 2020 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it under
+;; the terms of the GNU General Public License as published by the Free
+;; Software Foundation; either version 3, or (at your option) any later
+;; version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
+;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+;; for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+EOF
+
+print $copyright;
+
+sub mode_to_ldst_char
+{
+    my ($mode) = @_;
+    if ($mode eq 'DI') { return 'd'; }
+    if ($mode eq 'SI') { return 'w'; }
+    if ($mode eq 'HI') { return 'h'; }
+    if ($mode eq 'QI') { return 'b'; }
+    return '?';
+}
+
+sub gen_ld_cmpi_p10
+{
+  LMODE: foreach $lmode ('DI','SI','HI','QI') {
+      $ldst = mode_to_ldst_char($lmode);
+      $clobbermode = $lmode;
+      # For clobber, we need a SI/DI reg in case we split because we have to sign/zero extend.
+      if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; }
+    RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
+	# EXTDI does not exist, and we cannot directly produce HI/QI results.
+	next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
+	# Don't allow EXTQI because that would allow HI result which we can't do.
+	if ( $result eq "EXTQI" ) { $result = "GPR"; }
+      CCMODE: foreach $ccmode ('CC','CCUNS') {
+	  $np = "NON_PREFIXED_D";
+	  if ( $ccmode eq 'CC' ) {
+	      next CCMODE if $lmode eq 'QI';
+	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
+		  # ld and lwa are both DS-FORM.
+		  $np = "NON_PREFIXED_DS";
+	      }
+	      $cmpl = "";
+	      $echr = "a";
+	      $constpred = "const_m1_to_1_operand";
+	  } else {
+	      if ( $lmode eq 'DI' ) {
+		  # ld is DS-form, but lwz is not.
+		  $np = "NON_PREFIXED_DS";
+	      }
+	      $cmpl = "l";
+	      $echr = "z";
+	      $constpred = "const_0_to_1_operand";
+	  }
+	  if ($lmode eq 'DI') { $echr = ""; }
+	  if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {
+	      # We always need extension if result > lmode.
+	      if ( $ccmode eq 'CC' ) {
+		  $extend = "sign";
+	      } else {
+		  $extend = "zero";
+	      }
+	  } else {
+	      # Result of SI/DI does not need sign extension.
+	      $extend = "none";
+	  }
+	  print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
+	  print ";; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend\n";
+
+	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
+	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
+	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"non_update_memory_operand\" \"m\")\n";
+	  print "                 (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
+	  if ($result eq 'clobber') {
+	      print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
+	  } elsif ($result eq $lmode) {
+	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (match_dup 1))]\n";
+	  } else {
+	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
+	  }
+	  print "  \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n";
+	  print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n";
+	  print "  \"&& reload_completed\n";
+	  print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
+	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), ${lmode}mode, ${np}))\"\n";
+	  if ($extend eq "none") {
+	      print "  [(set (match_dup 0) (match_dup 1))\n";
+	  } else {
+	      $resultmode = $result;
+	      if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
+	      print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
+	  }
+	  print "   (set (match_dup 2)\n";
+	  print "        (compare:${ccmode} (match_dup 0)\n";
+	  print "		    (match_dup 3)))]\n";
+	  print "  \"\"\n";
+	  print "  [(set_attr \"type\" \"load\")\n";
+	  print "   (set_attr \"cost\" \"8\")\n";
+	  print "   (set_attr \"length\" \"8\")])\n";
+	  print "\n";
+      }
+    }
+  }
+}
+
+
+gen_ld_cmpi_p10();
+
+exit(0);
+
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 9ad5ae67302..78de8102f44 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
 
+;; Match op = -1, op = 0, or op = 1.
+(define_predicate "const_m1_to_1_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -1, 1)")))
+
 ;; Match op = 0..3.
 (define_predicate "const_0_to_3_operand"
   (and (match_code "const_int")
@@ -847,6 +852,15 @@ (define_special_predicate "update_address_mem"
 		    || GET_CODE (XEXP (op, 0)) == PRE_DEC
 		    || GET_CODE (XEXP (op, 0)) == PRE_MODIFY))"))
 
+;; Anything that matches memory_operand but does not update the address.
+(define_predicate "non_update_memory_operand"
+  (match_code "mem")
+{
+  if (update_address_mem (op, mode))
+    return 0;
+  return memory_operand (op, mode);
+})
+
 ;; Return 1 if the operand is a MEM with an indexed-form address.
 (define_special_predicate "indexed_address_mem"
   (match_test "(MEM_P (op)
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 8d2c1ffd6cf..3e65289d8df 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -82,7 +82,9 @@
 
 #define ISA_3_1_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
 				 | OPTION_MASK_POWER10			\
-				 | OTHER_POWER10_MASKS)
+				 | OTHER_POWER10_MASKS			\
+				 | OPTION_MASK_P10_FUSION		\
+				 | OPTION_MASK_P10_FUSION_LD_CMPI)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
@@ -129,6 +131,8 @@
 				 | OPTION_MASK_FLOAT128_KEYWORD		\
 				 | OPTION_MASK_FPRND			\
 				 | OPTION_MASK_POWER10			\
+				 | OPTION_MASK_P10_FUSION		\
+				 | OPTION_MASK_P10_FUSION_LD_CMPI	\
 				 | OPTION_MASK_HTM			\
 				 | OPTION_MASK_ISEL			\
 				 | OPTION_MASK_MFCRF			\
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index 3c4682b0e26..cd644083558 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -191,6 +191,8 @@ enum non_prefixed_form {
 
 extern enum insn_form address_to_insn_form (rtx, machine_mode,
 					    enum non_prefixed_form);
+extern bool address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
+				       enum non_prefixed_form non_prefixed_format);
 extern bool prefixed_load_p (rtx_insn *);
 extern bool prefixed_store_p (rtx_insn *);
 extern bool prefixed_paddi_p (rtx_insn *);
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 517467ebc63..759551d07ec 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p)
   if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
     rs6000_isa_flags |= OPTION_MASK_MMA;
 
+  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
+    rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
+
+  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)
+    rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI;
+
   /* Turn off vector pair/mma options on non-power10 systems.  */
   else if (!TARGET_POWER10 && TARGET_MMA)
     {
@@ -23614,6 +23620,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "power9-minmax",		OPTION_MASK_P9_MINMAX,		false, true  },
   { "power9-misc",		OPTION_MASK_P9_MISC,		false, true  },
   { "power9-vector",		OPTION_MASK_P9_VECTOR,		false, true  },
+  { "power10-fusion",		OPTION_MASK_P10_FUSION,		false, true  },
   { "powerpc-gfxopt",		OPTION_MASK_PPC_GFXOPT,		false, true  },
   { "powerpc-gpopt",		OPTION_MASK_PPC_GPOPT,		false, true  },
   { "prefixed",			OPTION_MASK_PREFIXED,		false, true  },
@@ -25705,6 +25712,50 @@ address_to_insn_form (rtx addr,
   return INSN_FORM_BAD;
 }
 
+/* Given address rtx ADDR for a load of MODE, is this legitimate for a
+   non-prefixed D-form or X-form instruction?  NON_PREFIXED_FORMAT is
+   given NON_PREFIXED_D or NON_PREFIXED_DS to indicate whether we want
+   a D-form or DS-form instruction.  X-form and base_reg are always
+   allowed.  */
+bool
+address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
+			   enum non_prefixed_form non_prefixed_format)
+{
+  enum insn_form result_form;
+
+  result_form = address_to_insn_form (addr, mode, non_prefixed_format);
+
+  switch (non_prefixed_format)
+    {
+    case NON_PREFIXED_D:
+      switch (result_form)
+	{
+	case INSN_FORM_X:
+	case INSN_FORM_D:
+	case INSN_FORM_DS:
+	case INSN_FORM_BASE_REG:
+	  return true;
+	default:
+	  break;
+	}
+      break;
+    case NON_PREFIXED_DS:
+      switch (result_form)
+	{
+	case INSN_FORM_X:
+	case INSN_FORM_DS:
+	case INSN_FORM_BASE_REG:
+	  return true;
+	default:
+	  break;
+	}
+      break;
+    default:
+      break;
+    }
+  return false;
+}
+
 /* Helper function to see if we're potentially looking at lfs/stfs.
    - PARALLEL containing a SET and a CLOBBER
    - stfs:
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 5bf9c83fc1e..307c0b200bd 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -539,6 +539,7 @@ extern int rs6000_vector_align[];
 #define MASK_UPDATE			OPTION_MASK_UPDATE
 #define MASK_VSX			OPTION_MASK_VSX
 #define MASK_POWER10			OPTION_MASK_POWER10
+#define MASK_P10_FUSION			OPTION_MASK_P10_FUSION
 
 #ifndef IN_LIBGCC2
 #define MASK_POWERPC64			OPTION_MASK_POWERPC64
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b89990f46bf..c39b7098978 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14926,3 +14926,4 @@ (define_insn "*cmpeqb_internal"
 (include "dfp.md")
 (include "crypto.md")
 (include "htm.md")
+(include "fusion.md")
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 2888172cb27..008a318b98d 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -479,6 +479,14 @@ mpower8-vector
 Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags)
 Use vector and scalar instructions added in ISA 2.07.
 
+mpower10-fusion
+Target Report Mask(P10_FUSION) Var(rs6000_isa_flags)
+Fuse certain integer operations together for better performance on power10.
+
+mpower10-fusion-ld-cmpi
+Target Undocumented Mask(P10_FUSION_LD_CMPI) Var(rs6000_isa_flags)
+Fuse certain integer operations together for better performance on power10.
+
 mcrypto
 Target Report Mask(CRYPTO) Var(rs6000_isa_flags)
 Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions.
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index 1ddb5729cb2..bcc71a9e21b 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
+$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
+	$(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
+
 $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
   $(srcdir)/config/rs6000/rs6000-cpus.def
 	$(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
@@ -86,4 +89,5 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs64.md \
 	$(srcdir)/config/rs6000/mma.md \
 	$(srcdir)/config/rs6000/crypto.md \
 	$(srcdir)/config/rs6000/htm.md \
-	$(srcdir)/config/rs6000/dfp.md
+	$(srcdir)/config/rs6000/dfp.md \
+	$(srcdir)/config/rs6000/fusion.md
-- 
2.27.0



* Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion
  2020-12-04 19:19 [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion acsawdey
@ 2020-12-07 20:48 ` will schmidt
  2020-12-21 18:11 ` Pat Haugen
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: will schmidt @ 2020-12-07 20:48 UTC (permalink / raw)
  To: acsawdey, gcc-patches; +Cc: wschmidt, segher

On Fri, 2020-12-04 at 13:19 -0600, acsawdey--- via Gcc-patches wrote:
> From: Aaron Sawdey <acsawdey@linux.ibm.com>
> 

Assorted comments are sprinkled around below.
Thanks,
-Will


> This patch adds the first batch of patterns to support p10 fusion. These
> will allow combine to create a single insn for a pair of instructions
> that that power10 can fuse and execute. These particular ones have the

Just one "that", or maybe "that the".
s/ones/fusion pairs/ ?

> requirement that only cr0 can be used when fusing a load with a compare
> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
> to put that requirement in, and if it doesn't work out later the splitter
> can get used.

Rather than "the splitter can get used", maybe say what it does:
"... the splitter will <do something...>".

> 
> The patterns are generated by a script genfusion.pl and live in new file
> fusion.md. This script will be expanded to generate more patterns for
> fusion.

ok

> 
> This also adds option -mpower10-fusion which defaults on for power10 and
> will gate all these fusion patterns. In addition I have added an
> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
> that just controls the load+compare-immediate patterns. I have make

made

> these default on for power10 but they are not disallowed for earlier
> processors because it is still valid code. This allows us to test the
> correctness of fusion code generation by turning it on explicitly.
> 
> If bootstrap/regtest is clean, ok for trunk?
> 
> Thanks!
> 
>    Aaron
> 
> gcc/ChangeLog:
> 
> 	* config/rs6000/genfusion.pl: New file, script to generate
> 	define_insn_and_split patterns so combine can arrange fused
> 	instructions next to each other.

New script to generate ...

> 	* config/rs6000/fusion.md: New file, generated fused instruction
> 	patterns for combine.

> 	* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
> 	(non_update_memory_operand): New predicate.
ok
> 	* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
> 	OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
> 	POWERPC_MASKS.
> 	* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
> 	prototype.

All usages of address_is_non_pfx_d_or_x() appear to be negated, i.e.
	+       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0),
	DImode, NON_PREFIXED_DS))"
Fully understanding that naming is hard, I wonder if that can be
adjusted to avoid the double negative, with something like
address_load_mode_requires_prefix (...foo)?
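
To make the naming point concrete, here is a minimal standalone sketch of
the split condition written without the double negative.  Everything in it
is an assumption for illustration: the enums are cut-down stand-ins for
GCC's insn_form and non_prefixed_form, the functions take an
already-classified form instead of an address rtx, and the names
model_address_not_fusable and model_fused_pair_needs_split are
hypothetical, not part of the patch.

```c
#include <stdbool.h>

/* Cut-down stand-ins for GCC's enum insn_form and enum non_prefixed_form.
   The real enums have more members; these exist only for this sketch.  */
enum model_insn_form
{
  FORM_BAD, FORM_BASE_REG, FORM_D, FORM_DS, FORM_X, FORM_PREFIXED
};
enum model_np_form { NP_D, NP_DS };

/* Mirrors the accept sets of address_is_non_pfx_d_or_x from the patch:
   X-form, DS-form and base-register addresses are always accepted,
   D-form only when a D-form instruction was requested.  */
bool
model_is_non_pfx_d_or_x (enum model_insn_form form, enum model_np_form fmt)
{
  switch (form)
    {
    case FORM_X:
    case FORM_DS:
    case FORM_BASE_REG:
      return true;
    case FORM_D:
      return fmt == NP_D;
    default:
      return false;
    }
}

/* Hypothetical positively named predicate: true when the address rules
   out the fused form.  */
bool
model_address_not_fusable (enum model_insn_form form, enum model_np_form fmt)
{
  return !model_is_non_pfx_d_or_x (form, fmt);
}

/* The split condition from the generated patterns, restated: split when
   the compare result is not in cr0 or the address cannot be fused.  */
bool
model_fused_pair_needs_split (bool cc_is_cr0, enum model_insn_form form,
			      enum model_np_form fmt)
{
  return !cc_is_cr0 || model_address_not_fusable (form, fmt);
}
```

With a helper shaped like model_address_not_fusable, the condition in the
.md file would read as a plain "||" of two positive tests.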


> 	* config/rs6000/rs6000.c (rs6000_option_override_internal):
> 	automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
>  	if target is power10.  (rs600_opt_masks): Allow -mpower10-fusion
> 	in function attributes.  (address_is_non_pfx_d_or_x): New function.

ok

> 	* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
> 	* config/rs6000/rs6000.md: Include fusion.md.
> 	* config/rs6000/rs6000.opt: Add -mpower10-fusion
> 	and -mpower10-fusion-ld-cmpi.

ok

> 	* config/rs6000/t-rs6000: Add dependencies involving fusion.md.

ok


> ---
>  gcc/config/rs6000/fusion.md       | 357 ++++++++++++++++++++++++++++++
>  gcc/config/rs6000/genfusion.pl    | 144 ++++++++++++
>  gcc/config/rs6000/predicates.md   |  14 ++
>  gcc/config/rs6000/rs6000-cpus.def |   6 +-
>  gcc/config/rs6000/rs6000-protos.h |   2 +
>  gcc/config/rs6000/rs6000.c        |  51 +++++
>  gcc/config/rs6000/rs6000.h        |   1 +
>  gcc/config/rs6000/rs6000.md       |   1 +
>  gcc/config/rs6000/rs6000.opt      |   8 +
>  gcc/config/rs6000/t-rs6000        |   6 +-
>  10 files changed, 588 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/config/rs6000/fusion.md
>  create mode 100755 gcc/config/rs6000/genfusion.pl
> 
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> new file mode 100644
> index 00000000000..a4d3a6ae7f3
> --- /dev/null
> +++ b/gcc/config/rs6000/fusion.md
> @@ -0,0 +1,357 @@
> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
> +(define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is DI compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is DI compare mode is CCUNS extend is none
> +(define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:SI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is clobber compare mode is CCUNS extend is none
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:SI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is SI compare mode is CC extend is none
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is SI compare mode is CCUNS extend is none
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (sign_extend:EXTSI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is EXTSI compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (zero_extend:EXTSI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:EXTSI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is clobber compare mode is CC extend is sign
> +(define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (sign_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is clobber compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is EXTHI compare mode is CC extend is sign
> +(define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (sign_extend:EXTHI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (sign_extend:EXTHI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is EXTHI compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (zero_extend:EXTHI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:EXTHI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is QI result mode is clobber compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is QI result mode is GPR compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:GPR 0 "gpc_reg_operand" "=r") (zero_extend:GPR (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +


Reviewed with a mix of in-depth analysis and skimming; nothing jumped
out at me here.
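
For orientation, the source shapes these patterns are aimed at look
roughly like the functions below: a value is loaded and immediately
compared against a small constant.  This is only a sketch.  The function
names are made up, and I have not checked which of the generated patterns
(if any) combine actually picks for each one when built with
-mcpu=power10, where the patch turns on -mpower10-fusion by default.

```c
#include <stdbool.h>

/* 64-bit load followed by a signed compare against 0.  */
bool
load_is_zero (const long long *p)
{
  return *p == 0;
}

/* 32-bit load followed by a signed compare against -1.  */
bool
load_is_minus_one (const int *p)
{
  return *p == -1;
}

/* 64-bit load followed by an unsigned compare against 1.  */
bool
load_above_one (const unsigned long long *p)
{
  return *p > 1;
}
```

Comparing the assembly for such functions with and without
-mno-power10-fusion-ld-cmpi would be one way to see whether the
load and compare end up adjacent and using cr0.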


> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
> new file mode 100755
> index 00000000000..494537c9439
> --- /dev/null
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -0,0 +1,144 @@
> +#!/usr/bin/perl -w
> +# Generate fusion.md 
> +# Copyright (C) 2020 Free Software Foundation, Inc.
> +#
> +# This file is part of GCC.
> +#
> +# GCC is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3, or (at your option)
> +# any later version.
> +#
> +# GCC is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +my $copyright =  <<'EOF';


> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;


Embedding the copyright date in an autogenerated file catches my eye.
I don't see this in things like $GCC_BUILD/gcc/insn-recog.c, so I'm not
sure it's necessary in this case (but it probably doesn't hurt).


> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +EOF
> +
> +print $copyright;
> +
> +sub mode_to_ldst_char
> +{
> +    my ($mode) = @_;
> +    if ($mode eq 'DI') { return 'd'; }
> +    if ($mode eq 'SI') { return 'w'; }
> +    if ($mode eq 'HI') { return 'h'; }
> +    if ($mode eq 'QI') { return 'b'; }
> +    return '?';
> +}
> +
> +sub gen_ld_cmpi_p10
> +{
> +  LMODE: foreach $lmode ('DI','SI','HI','QI') {
> +      $ldst = mode_to_ldst_char($lmode);
> +      $clobbermode = $lmode;
> +      # For clobber, we need a SI/DI reg in case we split because we have to sign/zero extend.
> +      if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; }
> +    RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
> +	# EXTDI does not exist, and we cannot directly produce HI/QI results.
> +	next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
> +	# Don't allow EXTQI because that would allow HI result which we can't do.
> +	if ( $result eq "EXTQI" ) { $result = "GPR"; }
> +      CCMODE: foreach $ccmode ('CC','CCUNS') {
> +	  $np = "NON_PREFIXED_D";
> +	  if ( $ccmode eq 'CC' ) {
> +	      next CCMODE if $lmode eq 'QI';
> +	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +		  # ld and lwa are both DS-FORM.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "";
> +	      $echr = "a";
> +	      $constpred = "const_m1_to_1_operand";
> +	  } else {
> +	      if ( $lmode eq 'DI' ) {
> +		  # ld is DS-form, but lwz is not.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "l";
> +	      $echr = "z";
> +	      $constpred = "const_0_to_1_operand";
> +	  }
> +	  if ($lmode eq 'DI') { $echr = ""; }
> +	  if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {
> +	      # We always need extension if result > lmode.
> +	      if ( $ccmode eq 'CC' ) {
> +		  $extend = "sign";
> +	      } else {
> +		  $extend = "zero";
> +	      }
> +	  } else {
> +	      # Result of SI/DI does not need sign extension.
> +	      $extend = "none";
> +	  }
> +	  print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
> +	  print ";; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend\n";
> +
> +	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
> +	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
> +	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"non_update_memory_operand\" \"m\")\n";
> +	  print "                 (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
> +	  if ($result eq 'clobber') {
> +	      print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
> +	  } elsif ($result eq $lmode) {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (match_dup 1))]\n";
> +	  } else {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
> +	  }
> +	  print "  \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n";
> +	  print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n";
> +	  print "  \"&& reload_completed\n";
> +	  print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
> +	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), ${lmode}mode, ${np}))\"\n";
> +	  if ($extend eq "none") {
> +	      print "  [(set (match_dup 0) (match_dup 1))\n";
> +	  } else {
> +	      $resultmode = $result;
> +	      if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
> +	      print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
> +	  }
> +	  print "   (set (match_dup 2)\n";
> +	  print "        (compare:${ccmode} (match_dup 0)\n";
> +	  print "		    (match_dup 3)))]\n";
> +	  print "  \"\"\n";
> +	  print "  [(set_attr \"type\" \"load\")\n";
> +	  print "   (set_attr \"cost\" \"8\")\n";
> +	  print "   (set_attr \"length\" \"8\")])\n";
> +	  print "\n";
> +      }
> +    }
> +  }
> +}

Looked over, seems OK.  Presumably testing will reveal any issues. :-)
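
As a cross-check on the loop structure, here is a small standalone C
model of two things the script does: choosing the load mnemonic and
deciding which (load mode, result, compare mode) combinations are
emitted.  It is a sketch of my reading of gen_ld_cmpi_p10, not a
replacement for it; the function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Build the load mnemonic as the script does: 'l', a size letter, then
   'a' for a signed compare or 'z' for an unsigned one, except that DI is
   plain "ld".  Returns 0 for combinations the script skips (signed QI)
   or for an unknown mode, and 1 otherwise.  */
int
fusion_load_mnemonic (const char *lmode, int is_unsigned,
		      char *buf, size_t len)
{
  char size;

  if (strcmp (lmode, "DI") == 0)
    size = 'd';
  else if (strcmp (lmode, "SI") == 0)
    size = 'w';
  else if (strcmp (lmode, "HI") == 0)
    size = 'h';
  else if (strcmp (lmode, "QI") == 0)
    size = 'b';
  else
    return 0;

  if (size == 'b' && !is_unsigned)
    return 0;			/* "next CCMODE if $lmode eq 'QI'" */
  if (size == 'd')
    snprintf (buf, len, "ld");
  else
    snprintf (buf, len, "l%c%c", size, is_unsigned ? 'z' : 'a');
  return 1;
}

/* Count the patterns the three nested loops emit.  */
int
fusion_pattern_count (void)
{
  static const char *const lmodes[] = { "DI", "SI", "HI", "QI" };
  int count = 0;

  for (int m = 0; m < 4; m++)
    for (int r = 0; r < 3; r++)	/* 0 = clobber, 1 = lmode, 2 = EXT<lmode> */
      {
	int is_di = strcmp (lmodes[m], "DI") == 0;
	int is_sub_word = (strcmp (lmodes[m], "HI") == 0
			   || strcmp (lmodes[m], "QI") == 0);

	/* EXTDI does not exist; HI/QI results are not produced directly.  */
	if ((r == 2 && is_di) || (r == 1 && is_sub_word))
	  continue;
	for (int uns = 0; uns < 2; uns++)
	  {
	    char buf[8];
	    if (fusion_load_mnemonic (lmodes[m], uns, buf, sizeof buf))
	      count++;
	  }
      }
  return count;
}
```

fusion_pattern_count () comes out as 16, which matches the number of
define_insn_and_split patterns in the posted fusion.md.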


> +
> +
> +gen_ld_cmpi_p10();
> +
> +exit(0);
> +
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 9ad5ae67302..78de8102f44 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand"
>    (and (match_code "const_int")
>         (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
> 
> +;; Match op = -1, op = 0, or op = 1.
> +(define_predicate "const_m1_to_1_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), -1, 1)")))
> +

What does the _m1 indicate here?  (I can't tell from pre-existing usage
whether it means minus, match, mode, or something else.)
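
Going only by the comment above the new predicate ("Match op = -1,
op = 0, or op = 1"), "m1" would read as "minus one".  Under that
assumption, the two constant predicates used by the fusion patterns
accept the ranges sketched below.  MODEL_IN_RANGE is a simplified
stand-in for GCC's IN_RANGE macro and the function names are
hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-in for GCC's IN_RANGE: LOWER <= VALUE <= UPPER.  */
#define MODEL_IN_RANGE(VALUE, LOWER, UPPER) \
  ((VALUE) >= (LOWER) && (VALUE) <= (UPPER))

/* Models const_m1_to_1_operand: accepts -1, 0 and 1 (signed compares).  */
bool
model_const_m1_to_1 (long value)
{
  return MODEL_IN_RANGE (value, -1, 1);
}

/* Models const_0_to_1_operand: accepts 0 and 1 (unsigned compares).  */
bool
model_const_0_to_1 (long value)
{
  return MODEL_IN_RANGE (value, 0, 1);
}
```

If that reading is right, a name or comment that spells out "minus"
might save the next reader the same question.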


>  ;; Match op = 0..3.
>  (define_predicate "const_0_to_3_operand"
>    (and (match_code "const_int")
> @@ -847,6 +852,15 @@ (define_special_predicate "update_address_mem"
>  		    || GET_CODE (XEXP (op, 0)) == PRE_DEC
>  		    || GET_CODE (XEXP (op, 0)) == PRE_MODIFY))"))
> 
> +;; Anything that matches memory_operand but does not update the address.
> +(define_predicate "non_update_memory_operand"
> +  (match_code "mem")
> +{
> +  if (update_address_mem (op, mode))
> +    return 0;
> +  return memory_operand (op, mode);
> +})
> +
>  ;; Return 1 if the operand is a MEM with an indexed-form address.
>  (define_special_predicate "indexed_address_mem"
>    (match_test "(MEM_P (op)
> diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
> index 8d2c1ffd6cf..3e65289d8df 100644
> --- a/gcc/config/rs6000/rs6000-cpus.def
> +++ b/gcc/config/rs6000/rs6000-cpus.def
> @@ -82,7 +82,9 @@
> 
>  #define ISA_3_1_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
>  				 | OPTION_MASK_POWER10			\
> -				 | OTHER_POWER10_MASKS)
> +				 | OTHER_POWER10_MASKS			\
> +				 | OPTION_MASK_P10_FUSION		\
> +				 | OPTION_MASK_P10_FUSION_LD_CMPI)
> 
>  /* Flags that need to be turned off if -mno-power9-vector.  */
>  #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
> @@ -129,6 +131,8 @@
>  				 | OPTION_MASK_FLOAT128_KEYWORD		\
>  				 | OPTION_MASK_FPRND			\
>  				 | OPTION_MASK_POWER10			\
> +				 | OPTION_MASK_P10_FUSION		\
> +				 | OPTION_MASK_P10_FUSION_LD_CMPI	\
>  				 | OPTION_MASK_HTM			\
>  				 | OPTION_MASK_ISEL			\
>  				 | OPTION_MASK_MFCRF			\

ok

> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> index 3c4682b0e26..cd644083558 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -191,6 +191,8 @@ enum non_prefixed_form {
> 
>  extern enum insn_form address_to_insn_form (rtx, machine_mode,
>  					    enum non_prefixed_form);
> +extern bool address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +				       enum non_prefixed_form non_prefix_format);
>  extern bool prefixed_load_p (rtx_insn *);
>  extern bool prefixed_store_p (rtx_insn *);
>  extern bool prefixed_paddi_p (rtx_insn *);
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 517467ebc63..759551d07ec 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p)
>    if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
>      rs6000_isa_flags |= OPTION_MASK_MMA;
> 
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
> +
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI;
> +
>    /* Turn off vector pair/mma options on non-power10 systems.  */
>    else if (!TARGET_POWER10 && TARGET_MMA)
>      {
> @@ -23614,6 +23620,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
>    { "power9-minmax",		OPTION_MASK_P9_MINMAX,		false, true  },
>    { "power9-misc",		OPTION_MASK_P9_MISC,		false, true  },
>    { "power9-vector",		OPTION_MASK_P9_VECTOR,		false, true  },
> +  { "power10-fusion",		OPTION_MASK_P10_FUSION,		false, true  },
>    { "powerpc-gfxopt",		OPTION_MASK_PPC_GFXOPT,		false, true  },
>    { "powerpc-gpopt",		OPTION_MASK_PPC_GPOPT,		false, true  },
>    { "prefixed",			OPTION_MASK_PREFIXED,		false, true  },
> @@ -25705,6 +25712,50 @@ address_to_insn_form (rtx addr,
>    return INSN_FORM_BAD;
>  }
> 


ok

> +/* Given address rtx ADDR for a load of MODE, is this legitimate for a
> +   non-prefixed D-form or X-form instruction?  NON_PREFIXED_FORMAT is
> +   given NON_PREFIXED_D or NON_PREFIXED_DS to indicate whether we want
> +   a D-form or DS-form instruction.  X-form and base_reg are always
> +   allowed.  */
> +bool
> +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +			   enum non_prefixed_form non_prefixed_format)
> +{
> +  enum insn_form result_form;
> +
> +  result_form = address_to_insn_form (addr, mode, non_prefixed_format);
> +
> +  switch (non_prefixed_format)
> +    {
> +    case NON_PREFIXED_D:
> +      switch (result_form)
> +	{
> +	case INSN_FORM_X:
> +	case INSN_FORM_D:
> +	case INSN_FORM_DS:
> +	case INSN_FORM_BASE_REG:
> +	  return true;
> +	default:
> +	  break;
> +	}
> +      break;
> +    case NON_PREFIXED_DS:
> +      switch (result_form)
> +	{
> +	case INSN_FORM_X:
> +	case INSN_FORM_DS:
> +	case INSN_FORM_BASE_REG:
> +	  return true;
> +	default:
> +	  break;
> +	}
> +      break;
> +    default:
> +      break;
> +    }
> +  return false;
> +}
> +
>  /* Helper function to see if we're potentially looking at lfs/stfs.
>     - PARALLEL containing a SET and a CLOBBER
>     - stfs:


ok
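
The nested switch in address_is_non_pfx_d_or_x boils down to a small acceptance table. The sketch below restates it standalone, assuming only the two requested formats the patch actually passes (D and DS). The enum and function names are stand-ins, not the real rs6000 definitions, and the real function also returns false for any other requested format.

```c
#include <stdbool.h>

/* Stand-ins for the rs6000 enums; only the members needed here.  */
enum insn_form { FORM_BAD, FORM_BASE_REG, FORM_D, FORM_DS, FORM_X,
                 FORM_PREFIXED };
enum non_prefixed_form { WANT_D, WANT_DS };

/* X-form, base-register, and DS-form addresses are accepted for both
   requested formats.  A D-form address is accepted only when a D-form
   load was requested, since a DS-form load cannot encode it.  */
static bool
form_ok_for_fusion (enum insn_form form, enum non_prefixed_form want)
{
  switch (form)
    {
    case FORM_X:
    case FORM_BASE_REG:
    case FORM_DS:
      return true;
    case FORM_D:
      return want == WANT_D;
    default:
      return false;
    }
}
```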

> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 5bf9c83fc1e..307c0b200bd 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -539,6 +539,7 @@ extern int rs6000_vector_align[];
>  #define MASK_UPDATE			OPTION_MASK_UPDATE
>  #define MASK_VSX			OPTION_MASK_VSX
>  #define MASK_POWER10			OPTION_MASK_POWER10
> +#define MASK_P10_FUSION			OPTION_MASK_P10_FUSION
> 
>  #ifndef IN_LIBGCC2
>  #define MASK_POWERPC64			OPTION_MASK_POWERPC64

ok

> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index b89990f46bf..c39b7098978 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -14926,3 +14926,4 @@ (define_insn "*cmpeqb_internal"
>  (include "dfp.md")
>  (include "crypto.md")
>  (include "htm.md")
> +(include "fusion.md")

ok

> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 2888172cb27..008a318b98d 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -479,6 +479,14 @@ mpower8-vector
>  Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags)
>  Use vector and scalar instructions added in ISA 2.07.
> 
> +mpower10-fusion
> +Target Report Mask(P10_FUSION) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.
> +
> +mpower10-fusion-ld-cmpi
> +Target Undocumented Mask(P10_FUSION_LD_CMPI) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.
> +
>  mcrypto
>  Target Report Mask(CRYPTO) Var(rs6000_isa_flags)
>  Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions.


ok

> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
> index 1ddb5729cb2..bcc71a9e21b 100644
> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
>  	$(COMPILE) $<
>  	$(POSTCOMPILE)
> 
> +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
> +	$(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
> +
>  $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
>    $(srcdir)/config/rs6000/rs6000-cpus.def
>  	$(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
> @@ -86,4 +89,5 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs64.md \
>  	$(srcdir)/config/rs6000/mma.md \
>  	$(srcdir)/config/rs6000/crypto.md \
>  	$(srcdir)/config/rs6000/htm.md \
> -	$(srcdir)/config/rs6000/dfp.md
> +	$(srcdir)/config/rs6000/dfp.md \
> +	$(srcdir)/config/rs6000/fusion.md


ok.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion
  2020-12-04 19:19 [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion acsawdey
  2020-12-07 20:48 ` will schmidt
@ 2020-12-21 18:11 ` Pat Haugen
  2020-12-21 22:48   ` Segher Boessenkool
  2021-01-03 20:42 ` Aaron Sawdey
  2021-01-26  0:51 ` Segher Boessenkool
  3 siblings, 1 reply; 7+ messages in thread
From: Pat Haugen @ 2020-12-21 18:11 UTC (permalink / raw)
  To: acsawdey, gcc-patches; +Cc: wschmidt, segher

On 12/4/20 1:19 PM, acsawdey--- via Gcc-patches wrote:
> +	  print "  [(set_attr \"type\" \"load\")\n";

We need to tag these with a new instruction type, such as 'fused-load-cmp', so the scheduler can distinguish them from normal loads.

-Pat

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion
  2020-12-21 18:11 ` Pat Haugen
@ 2020-12-21 22:48   ` Segher Boessenkool
  0 siblings, 0 replies; 7+ messages in thread
From: Segher Boessenkool @ 2020-12-21 22:48 UTC (permalink / raw)
  To: Pat Haugen; +Cc: acsawdey, gcc-patches, wschmidt

On Mon, Dec 21, 2020 at 12:11:44PM -0600, Pat Haugen wrote:
> On 12/4/20 1:19 PM, acsawdey--- via Gcc-patches wrote:
> > +	  print "  [(set_attr \"type\" \"load\")\n";
> 
> We need to tag these with a new instruction type, such as 'fused-load-cmp', so the scheduler can distinguish them from normal loads.

Yeah...  and the insn_cost function can use that to give better costs
for such fused instructions, as well (it will right now do 12 for
load+cmp, not all that bad -- for combine that always counts as better
than separate load (8) and cmp (4) insns, since it is "just one insn"...
but we can do better than that, make it a bit cheaper).
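
As a rough illustration of those numbers: the sketch below assumes a replacement is rejected only when it costs strictly more than the insns it replaces, which is a simplification of what combine really checks. The constants are the costs quoted above, and the function name is hypothetical.

```c
#include <stdbool.h>

/* Costs quoted in this message.  */
#define LOAD_COST   8
#define CMP_COST    4
#define FUSED_COST 12

/* Simplified acceptance rule (an assumption, not combine's actual
   code): keep the combined insn unless it is strictly more expensive
   than the separate insns.  */
static bool
combination_accepted (int old_cost, int new_cost)
{
  return new_cost <= old_cost;
}
```

Under that rule, combination_accepted (LOAD_COST + CMP_COST, FUSED_COST) holds on a tie, and a cheaper cost for the fused type would make the preference explicit rather than relying on the tie.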


Segher

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion
  2020-12-04 19:19 [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion acsawdey
  2020-12-07 20:48 ` will schmidt
  2020-12-21 18:11 ` Pat Haugen
@ 2021-01-03 20:42 ` Aaron Sawdey
  2021-01-19  4:43   ` Aaron Sawdey
  2021-01-26  0:51 ` Segher Boessenkool
  3 siblings, 1 reply; 7+ messages in thread
From: Aaron Sawdey @ 2021-01-03 20:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, Bill Schmidt, Pat Haugen

Ping.

I assume we’re going to want a separate patch for the new instruction type.

Aaron Sawdey, Ph.D. sawdey@linux.ibm.com
IBM Linux on POWER Toolchain
 

> On Dec 4, 2020, at 1:19 PM, acsawdey@linux.ibm.com wrote:
> 
> From: Aaron Sawdey <acsawdey@linux.ibm.com>
> 
> This patch adds the first batch of patterns to support p10 fusion. These
> will allow combine to create a single insn for a pair of instructions
> that power10 can fuse and execute. These particular ones have the
> requirement that only cr0 can be used when fusing a load with a compare
> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
> to put that requirement in, and if it doesn't work out later the splitter
> can get used.
> 
> The patterns are generated by a script genfusion.pl and live in new file
> fusion.md. This script will be expanded to generate more patterns for
> fusion.
> 
> This also adds option -mpower10-fusion which defaults on for power10 and
> will gate all these fusion patterns. In addition I have added an
> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
> that just controls the load+compare-immediate patterns. I have made
> these default on for power10 but they are not disallowed for earlier
> processors because it is still valid code. This allows us to test the
> correctness of fusion code generation by turning it on explicitly.
> 
> If bootstrap/regtest is clean, ok for trunk?
> 
> Thanks!
> 
>   Aaron
> 
> gcc/ChangeLog:
> 
> 	* config/rs6000/genfusion.pl: New file, script to generate
> 	define_insn_and_split patterns so combine can arrange fused
> 	instructions next to each other.
> 	* config/rs6000/fusion.md: New file, generated fused instruction
> 	patterns for combine.
> 	* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
> 	(non_update_memory_operand): New predicate.
> 	* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
> 	OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
> 	POWERPC_MASKS.
> 	* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
> 	prototype.
> 	* config/rs6000/rs6000.c (rs6000_option_override_internal):
> 	Automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
> 	if target is power10.  (rs6000_opt_masks): Allow -mpower10-fusion
> 	in function attributes.  (address_is_non_pfx_d_or_x): New function.
> 	* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
> 	* config/rs6000/rs6000.md: Include fusion.md.
> 	* config/rs6000/rs6000.opt: Add -mpower10-fusion
> 	and -mpower10-fusion-ld-cmpi.
> 	* config/rs6000/t-rs6000: Add dependencies involving fusion.md.
> ---
> gcc/config/rs6000/fusion.md       | 357 ++++++++++++++++++++++++++++++
> gcc/config/rs6000/genfusion.pl    | 144 ++++++++++++
> gcc/config/rs6000/predicates.md   |  14 ++
> gcc/config/rs6000/rs6000-cpus.def |   6 +-
> gcc/config/rs6000/rs6000-protos.h |   2 +
> gcc/config/rs6000/rs6000.c        |  51 +++++
> gcc/config/rs6000/rs6000.h        |   1 +
> gcc/config/rs6000/rs6000.md       |   1 +
> gcc/config/rs6000/rs6000.opt      |   8 +
> gcc/config/rs6000/t-rs6000        |   6 +-
> 10 files changed, 588 insertions(+), 2 deletions(-)
> create mode 100644 gcc/config/rs6000/fusion.md
> create mode 100755 gcc/config/rs6000/genfusion.pl
> 
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> new file mode 100644
> index 00000000000..a4d3a6ae7f3
> --- /dev/null
> +++ b/gcc/config/rs6000/fusion.md
> @@ -0,0 +1,357 @@
> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
> +(define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is DI compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is DI compare mode is CCUNS extend is none
> +(define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:SI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is clobber compare mode is CCUNS extend is none
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:SI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is SI compare mode is CC extend is none
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is SI compare mode is CCUNS extend is none
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
> +  [(set (match_dup 0) (sign_extend:EXTSI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is SI result mode is EXTSI compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (zero_extend:EXTSI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:EXTSI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is clobber compare mode is CC extend is sign
> +(define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (sign_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is clobber compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is EXTHI compare mode is CC extend is sign
> +(define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (sign_extend:EXTHI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (sign_extend:EXTHI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is HI result mode is EXTHI compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (zero_extend:EXTHI (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:EXTHI (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is QI result mode is clobber compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is QI result mode is GPR compare mode is CCUNS extend is zero
> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (set (match_operand:GPR 0 "gpc_reg_operand" "=r") (zero_extend:GPR (match_dup 1)))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
> +   (set (match_dup 2)
> +        (compare:CCUNS (match_dup 0)
> +		    (match_dup 3)))]
> +  ""
> +  [(set_attr "type" "load")
> +   (set_attr "cost" "8")
> +   (set_attr "length" "8")])
> +
> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
> new file mode 100755
> index 00000000000..494537c9439
> --- /dev/null
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -0,0 +1,144 @@
> +#!/usr/bin/perl -w
> +# Generate fusion.md 
> +# Copyright (C) 2020 Free Software Foundation, Inc.
> +#
> +# This file is part of GCC.
> +#
> +# GCC is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3, or (at your option)
> +# any later version.
> +#
> +# GCC is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +my $copyright =  <<'EOF';
> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +EOF
> +
> +print $copyright;
> +
> +sub mode_to_ldst_char
> +{
> +    my ($mode) = @_;
> +    if ($mode eq 'DI') { return 'd'; }
> +    if ($mode eq 'SI') { return 'w'; }
> +    if ($mode eq 'HI') { return 'h'; }
> +    if ($mode eq 'QI') { return 'b'; }
> +    return '?';
> +}
> +
> +sub gen_ld_cmpi_p10
> +{
> +  LMODE: foreach $lmode ('DI','SI','HI','QI') {
> +      $ldst = mode_to_ldst_char($lmode);
> +      $clobbermode = $lmode;
> +      # For clobber, we need a SI/DI reg in case we split because we have to sign/zero extend.
> +      if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; }
> +    RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
> +	# EXTDI does not exist, and we cannot directly produce HI/QI results.
> +	next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
> +	# Don't allow EXTQI because that would allow HI result which we can't do.
> +	if ( $result eq "EXTQI" ) { $result = "GPR"; }
> +      CCMODE: foreach $ccmode ('CC','CCUNS') {
> +	  $np = "NON_PREFIXED_D";
> +	  if ( $ccmode eq 'CC' ) {
> +	      next CCMODE if $lmode eq 'QI';
> +	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +		  # ld and lwa are both DS-FORM.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "";
> +	      $echr = "a";
> +	      $constpred = "const_m1_to_1_operand";
> +	  } else {
> +	      if ( $lmode eq 'DI' ) {
> +		  # ld is DS-form, but lwz is not.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "l";
> +	      $echr = "z";
> +	      $constpred = "const_0_to_1_operand";
> +	  }
> +	  if ($lmode eq 'DI') { $echr = ""; }
> +	  if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {
> +	      # We always need extension if result > lmode.
> +	      if ( $ccmode eq 'CC' ) {
> +		  $extend = "sign";
> +	      } else {
> +		  $extend = "zero";
> +	      }
> +	  } else {
> +	      # Result of SI/DI does not need sign extension.
> +	      $extend = "none";
> +	  }
> +	  print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
> +	  print ";; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend\n";
> +
> +	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
> +	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
> +	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"non_update_memory_operand\" \"m\")\n";
> +	  print "                 (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
> +	  if ($result eq 'clobber') {
> +	      print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
> +	  } elsif ($result eq $lmode) {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (match_dup 1))]\n";
> +	  } else {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
> +	  }
> +	  print "  \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n";
> +	  print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n";
> +	  print "  \"&& reload_completed\n";
> +	  print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
> +	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), ${lmode}mode, ${np}))\"\n";
> +	  if ($extend eq "none") {
> +	      print "  [(set (match_dup 0) (match_dup 1))\n";
> +	  } else {
> +	      $resultmode = $result;
> +	      if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
> +	      print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
> +	  }
> +	  print "   (set (match_dup 2)\n";
> +	  print "        (compare:${ccmode} (match_dup 0)\n";
> +	  print "		    (match_dup 3)))]\n";
> +	  print "  \"\"\n";
> +	  print "  [(set_attr \"type\" \"load\")\n";
> +	  print "   (set_attr \"cost\" \"8\")\n";
> +	  print "   (set_attr \"length\" \"8\")])\n";
> +	  print "\n";
> +      }
> +    }
> +  }
> +}
> +
> +
> +gen_ld_cmpi_p10();
> +
> +exit(0);
> +
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 9ad5ae67302..78de8102f44 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand"
>   (and (match_code "const_int")
>        (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
> 
> +;; Match op = -1, op = 0, or op = 1.
> +(define_predicate "const_m1_to_1_operand"
> +  (and (match_code "const_int")
> +       (match_test "IN_RANGE (INTVAL (op), -1, 1)")))
> +
> ;; Match op = 0..3.
> (define_predicate "const_0_to_3_operand"
>   (and (match_code "const_int")
> @@ -847,6 +852,15 @@ (define_special_predicate "update_address_mem"
> 		    || GET_CODE (XEXP (op, 0)) == PRE_DEC
> 		    || GET_CODE (XEXP (op, 0)) == PRE_MODIFY))"))
> 
> +;; Anything that matches memory_operand but does not update the address.
> +(define_predicate "non_update_memory_operand"
> +  (match_code "mem")
> +{
> +  if (update_address_mem (op, mode))
> +    return 0;
> +  return memory_operand (op, mode);
> +})
> +
> ;; Return 1 if the operand is a MEM with an indexed-form address.
> (define_special_predicate "indexed_address_mem"
>   (match_test "(MEM_P (op)
> diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
> index 8d2c1ffd6cf..3e65289d8df 100644
> --- a/gcc/config/rs6000/rs6000-cpus.def
> +++ b/gcc/config/rs6000/rs6000-cpus.def
> @@ -82,7 +82,9 @@
> 
> #define ISA_3_1_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
> 				 | OPTION_MASK_POWER10			\
> -				 | OTHER_POWER10_MASKS)
> +				 | OTHER_POWER10_MASKS			\
> +				 | OPTION_MASK_P10_FUSION		\
> +				 | OPTION_MASK_P10_FUSION_LD_CMPI)
> 
> /* Flags that need to be turned off if -mno-power9-vector.  */
> #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
> @@ -129,6 +131,8 @@
> 				 | OPTION_MASK_FLOAT128_KEYWORD		\
> 				 | OPTION_MASK_FPRND			\
> 				 | OPTION_MASK_POWER10			\
> +				 | OPTION_MASK_P10_FUSION		\
> +				 | OPTION_MASK_P10_FUSION_LD_CMPI	\
> 				 | OPTION_MASK_HTM			\
> 				 | OPTION_MASK_ISEL			\
> 				 | OPTION_MASK_MFCRF			\
> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
> index 3c4682b0e26..cd644083558 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -191,6 +191,8 @@ enum non_prefixed_form {
> 
> extern enum insn_form address_to_insn_form (rtx, machine_mode,
> 					    enum non_prefixed_form);
> +extern bool address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +				       enum non_prefixed_form non_prefix_format);
> extern bool prefixed_load_p (rtx_insn *);
> extern bool prefixed_store_p (rtx_insn *);
> extern bool prefixed_paddi_p (rtx_insn *);
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 517467ebc63..759551d07ec 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p)
>   if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
>     rs6000_isa_flags |= OPTION_MASK_MMA;
> 
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
> +
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI;
> +
>   /* Turn off vector pair/mma options on non-power10 systems.  */
>   else if (!TARGET_POWER10 && TARGET_MMA)
>     {
> @@ -23614,6 +23620,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
>   { "power9-minmax",		OPTION_MASK_P9_MINMAX,		false, true  },
>   { "power9-misc",		OPTION_MASK_P9_MISC,		false, true  },
>   { "power9-vector",		OPTION_MASK_P9_VECTOR,		false, true  },
> +  { "power10-fusion",		OPTION_MASK_P10_FUSION,		false, true  },
>   { "powerpc-gfxopt",		OPTION_MASK_PPC_GFXOPT,		false, true  },
>   { "powerpc-gpopt",		OPTION_MASK_PPC_GPOPT,		false, true  },
>   { "prefixed",			OPTION_MASK_PREFIXED,		false, true  },
> @@ -25705,6 +25712,50 @@ address_to_insn_form (rtx addr,
>   return INSN_FORM_BAD;
> }
> 
> +/* Given address rtx ADDR for a load of MODE, is this legitimate for a
> +   non-prefixed D-form or X-form instruction?  NON_PREFIXED_FORMAT is
> +   given NON_PREFIXED_D or NON_PREFIXED_DS to indicate whether we want
> +   a D-form or DS-form instruction.  X-form and base_reg are always
> +   allowed.  */
> +bool
> +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +			   enum non_prefixed_form non_prefixed_format)
> +{
> +  enum insn_form result_form;
> +
> +  result_form = address_to_insn_form (addr, mode, non_prefixed_format);
> +
> +  switch (non_prefixed_format)
> +    {
> +    case NON_PREFIXED_D:
> +      switch (result_form)
> +	{
> +	case INSN_FORM_X:
> +	case INSN_FORM_D:
> +	case INSN_FORM_DS:
> +	case INSN_FORM_BASE_REG:
> +	  return true;
> +	default:
> +	  break;
> +	}
> +      break;
> +    case NON_PREFIXED_DS:
> +      switch (result_form)
> +	{
> +	case INSN_FORM_X:
> +	case INSN_FORM_DS:
> +	case INSN_FORM_BASE_REG:
> +	  return true;
> +	default:
> +	  break;
> +	}
> +      break;
> +    default:
> +      break;
> +    }
> +  return false;
> +}
> +
> /* Helper function to see if we're potentially looking at lfs/stfs.
>    - PARALLEL containing a SET and a CLOBBER
>    - stfs:
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 5bf9c83fc1e..307c0b200bd 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -539,6 +539,7 @@ extern int rs6000_vector_align[];
> #define MASK_UPDATE			OPTION_MASK_UPDATE
> #define MASK_VSX			OPTION_MASK_VSX
> #define MASK_POWER10			OPTION_MASK_POWER10
> +#define MASK_P10_FUSION			OPTION_MASK_P10_FUSION
> 
> #ifndef IN_LIBGCC2
> #define MASK_POWERPC64			OPTION_MASK_POWERPC64
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index b89990f46bf..c39b7098978 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -14926,3 +14926,4 @@ (define_insn "*cmpeqb_internal"
> (include "dfp.md")
> (include "crypto.md")
> (include "htm.md")
> +(include "fusion.md")
> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 2888172cb27..008a318b98d 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -479,6 +479,14 @@ mpower8-vector
> Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags)
> Use vector and scalar instructions added in ISA 2.07.
> 
> +mpower10-fusion
> +Target Report Mask(P10_FUSION) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.
> +
> +mpower10-fusion-ld-cmpi
> +Target Undocumented Mask(P10_FUSION_LD_CMPI) Var(rs6000_isa_flags)
> +Fuse certain integer operations together for better performance on power10.
> +
> mcrypto
> Target Report Mask(CRYPTO) Var(rs6000_isa_flags)
> Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions.
> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
> index 1ddb5729cb2..bcc71a9e21b 100644
> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
> 	$(COMPILE) $<
> 	$(POSTCOMPILE)
> 
> +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
> +	$(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
> +
> $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
>   $(srcdir)/config/rs6000/rs6000-cpus.def
> 	$(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
> @@ -86,4 +89,5 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs64.md \
> 	$(srcdir)/config/rs6000/mma.md \
> 	$(srcdir)/config/rs6000/crypto.md \
> 	$(srcdir)/config/rs6000/htm.md \
> -	$(srcdir)/config/rs6000/dfp.md
> +	$(srcdir)/config/rs6000/dfp.md \
> +	$(srcdir)/config/rs6000/fusion.md
> -- 
> 2.27.0
> 
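For readers following the NON_PREFIXED_D / NON_PREFIXED_DS distinction in
address_is_non_pfx_d_or_x above: the difference comes down to which
displacements the load can encode. The sketch below is a simplified model
only, not the GCC implementation. The names disp_form and displacement_fits
are made up for illustration, and it ignores prefixed, X-form and
base-register addresses entirely. It assumes a signed 16-bit displacement
for D-form, and for DS-form (ld, lwa) additionally requires a multiple of 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; these are not GCC identifiers.  */
enum disp_form { DISP_D, DISP_DS };

/* Return true if OFFSET can be encoded as the displacement of a
   non-prefixed load of the given FORM.  D-form takes any signed 16-bit
   value; DS-form also needs the low two bits to be zero.  */
static bool
displacement_fits (long offset, enum disp_form form)
{
  assert (form == DISP_D || form == DISP_DS);

  if (offset < -32768 || offset > 32767)
    return false;
  if (form == DISP_DS)
    return offset % 4 == 0;
  return true;
}
```

Under this model an offset of 6 is fine for lwz (D-form) but not for ld or
lwa (DS-form), which is the kind of case where the splitter condition in
the patterns would fire.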



* Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion
  2021-01-03 20:42 ` Aaron Sawdey
@ 2021-01-19  4:43   ` Aaron Sawdey
  0 siblings, 0 replies; 7+ messages in thread
From: Aaron Sawdey @ 2021-01-19  4:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, Bill Schmidt, Pat Haugen

Ping.

Aaron Sawdey, Ph.D. sawdey@linux.ibm.com
IBM Linux on POWER Toolchain
 

> On Jan 3, 2021, at 2:42 PM, Aaron Sawdey <acsawdey@linux.ibm.com> wrote:
> 
> Ping.
> 
> I assume we’re going to want a separate patch for the new instruction type.
> 
> Aaron Sawdey, Ph.D. sawdey@linux.ibm.com
> IBM Linux on POWER Toolchain
> 
> 
>> On Dec 4, 2020, at 1:19 PM, acsawdey@linux.ibm.com wrote:
>> 
>> From: Aaron Sawdey <acsawdey@linux.ibm.com>
>> 
>> This patch adds the first batch of patterns to support p10 fusion. These
>> will allow combine to create a single insn for a pair of instructions
>> that power10 can fuse and execute. These particular ones have the
>> requirement that only cr0 can be used when fusing a load with a compare
>> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
>> to put that requirement in, and if it doesn't work out later the splitter
>> can get used.
>> 
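The constraints in the paragraph above can be sketched as a small check.
This is an illustrative model only: cmp_kind, fusable_immediate and
keep_fused are hypothetical names, not part of the patch, and the real
decision is made by the insn predicates and the split condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the rules described in the mail; none of these
   names exist in GCC.  */
enum cmp_kind { CMP_SIGNED, CMP_UNSIGNED };

/* True if IMM is an immediate the fused compare accepts:
   -1/0/1 for a signed compare, 0/1 for an unsigned compare.  */
static bool
fusable_immediate (enum cmp_kind kind, long imm)
{
  if (kind == CMP_SIGNED)
    return imm >= -1 && imm <= 1;
  return imm >= 0 && imm <= 1;
}

/* True if the load+compare pair can stay fused.  CR_FIELD is the
   condition register field (0..7) chosen by register allocation; only
   cr0 qualifies, anything else means the pair gets split.  */
static bool
keep_fused (enum cmp_kind kind, long imm, int cr_field)
{
  assert (cr_field >= 0 && cr_field <= 7);
  return cr_field == 0 && fusable_immediate (kind, imm);
}
```

For example, a signed compare against -1 in cr0 stays fused, while the
same compare allocated to cr1 would be split back into a separate load and
compare.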
>> The patterns are generated by a script genfusion.pl and live in new file
>> fusion.md. This script will be expanded to generate more patterns for
>> fusion.
>> 
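To make the generator's enumeration easier to follow, here is a rough C
transcription of the naming logic in gen_ld_cmpi_p10. It is a sketch for
reading along, not a replacement for genfusion.pl: the function
list_ld_cmpi_patterns is a hypothetical name, and it only builds the
pattern names, not the pattern bodies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fill NAMES with the pattern names the generator loop would emit, in
   order, and return how many there are.  Hypothetical helper that mirrors
   the loop structure of gen_ld_cmpi_p10.  */
static int
list_ld_cmpi_patterns (char names[][64], int max_names)
{
  static const char *const lmodes[] = { "DI", "SI", "HI", "QI" };
  static const char ldst_chars[] = { 'd', 'w', 'h', 'b' };
  static const char *const ccmodes[] = { "CC", "CCUNS" };
  int count = 0;

  for (int m = 0; m < 4; m++)
    {
      const char *lmode = lmodes[m];
      /* HI and QI clobbers use a GPR scratch, so they always extend.  */
      int gpr_clobber = (m >= 2);

      for (int r = 0; r < 3; r++)
        {
          char result[16];
          if (r == 0)
            strcpy (result, "clobber");
          else if (r == 1)
            strcpy (result, lmode);
          else
            snprintf (result, sizeof result, "EXT%s", lmode);

          /* EXTDI does not exist; HI/QI results cannot be produced.  */
          if (!strcmp (result, "EXTDI") || !strcmp (result, "HI")
              || !strcmp (result, "QI"))
            continue;
          /* EXTQI would allow an HI result, so use GPR instead.  */
          if (!strcmp (result, "EXTQI"))
            strcpy (result, "GPR");

          for (int c = 0; c < 2; c++)
            {
              int is_signed = (c == 0);
              /* There is no sign-extending byte load.  */
              if (is_signed && m == 3)
                continue;

              const char *echr = (m == 0) ? "" : (is_signed ? "a" : "z");
              const char *cmpl = is_signed ? "" : "l";
              int extends = !strncmp (result, "EXT", 3)
                            || !strcmp (result, "GPR") || gpr_clobber;
              const char *extend
                = !extends ? "none" : (is_signed ? "sign" : "zero");

              assert (count < max_names);
              snprintf (names[count++], 64, "*l%c%s_cmp%sdi_cr0_%s_%s_%s_%s",
                        ldst_chars[m], echr, cmpl, lmode, result,
                        ccmodes[c], extend);
            }
        }
    }
  return count;
}
```

Calling it with a 32-entry array yields 16 names, which lines up with the
16 define_insn_and_split patterns in the fusion.md hunk of this patch.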
>> This also adds option -mpower10-fusion which defaults on for power10 and
>> will gate all these fusion patterns. In addition I have added an
>> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
>> that just controls the load+compare-immediate patterns. I have made
>> these default on for power10 but they are not disallowed for earlier
>> processors because it is still valid code. This allows us to test the
>> correctness of fusion code generation by turning it on explicitly.
>> 
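The defaulting behaviour described above (on for power10 unless the user
said otherwise, still allowed on older processors when requested
explicitly) can be modelled as follows. The mask values and the helper
names apply_p10_default and ld_cmpi_patterns_enabled are stand-ins chosen
for this sketch; the real bits come from rs6000.opt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for illustration; the real OPTION_MASK_* values
   are generated from rs6000.opt.  */
#define MASK_P10_FUSION         (UINT64_C (1) << 0)
#define MASK_P10_FUSION_LD_CMPI (UINT64_C (1) << 1)

/* Turn MASK on by default when targeting power10, unless the user set or
   cleared it explicitly.  Explicit settings are never overridden, so the
   option can still be enabled by hand for earlier processors.  */
static uint64_t
apply_p10_default (uint64_t flags, uint64_t explicit_flags,
                   bool target_power10, uint64_t mask)
{
  assert (mask != 0);
  if (target_power10 && (explicit_flags & mask) == 0)
    flags |= mask;
  return flags;
}

/* The load-cmpi patterns are gated on both options being enabled.  */
static bool
ld_cmpi_patterns_enabled (uint64_t flags)
{
  uint64_t both = MASK_P10_FUSION | MASK_P10_FUSION_LD_CMPI;
  return (flags & both) == both;
}
```

So an explicit -mno-power10-fusion on a power10 target leaves the bit
clear, and enabling both options explicitly on an older target turns the
patterns on for testing.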
>> If bootstrap/regtest is clean, ok for trunk?
>> 
>> Thanks!
>> 
>>  Aaron
>> 
>> gcc/ChangeLog:
>> 
>> 	* config/rs6000/genfusion.pl: New file, script to generate
>> 	define_insn_and_split patterns so combine can arrange fused
>> 	instructions next to each other.
>> 	* config/rs6000/fusion.md: New file, generated fused instruction
>> 	patterns for combine.
>> 	* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
>> 	(non_update_memory_operand): New predicate.
>> 	* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
>> 	OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
>> 	POWERPC_MASKS.
>> 	* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
>> 	prototype.
>> 	* config/rs6000/rs6000.c (rs6000_option_override_internal):
>> 	Automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
>> 	if target is power10.  (rs6000_opt_masks): Allow -mpower10-fusion
>> 	in function attributes.  (address_is_non_pfx_d_or_x): New function.
>> 	* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
>> 	* config/rs6000/rs6000.md: Include fusion.md.
>> 	* config/rs6000/rs6000.opt: Add -mpower10-fusion
>> 	and -mpower10-fusion-ld-cmpi.
>> 	* config/rs6000/t-rs6000: Add dependencies involving fusion.md.
>> ---
>> gcc/config/rs6000/fusion.md       | 357 ++++++++++++++++++++++++++++++
>> gcc/config/rs6000/genfusion.pl    | 144 ++++++++++++
>> gcc/config/rs6000/predicates.md   |  14 ++
>> gcc/config/rs6000/rs6000-cpus.def |   6 +-
>> gcc/config/rs6000/rs6000-protos.h |   2 +
>> gcc/config/rs6000/rs6000.c        |  51 +++++
>> gcc/config/rs6000/rs6000.h        |   1 +
>> gcc/config/rs6000/rs6000.md       |   1 +
>> gcc/config/rs6000/rs6000.opt      |   8 +
>> gcc/config/rs6000/t-rs6000        |   6 +-
>> 10 files changed, 588 insertions(+), 2 deletions(-)
>> create mode 100644 gcc/config/rs6000/fusion.md
>> create mode 100755 gcc/config/rs6000/genfusion.pl
>> 
>> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
>> new file mode 100644
>> index 00000000000..a4d3a6ae7f3
>> --- /dev/null
>> +++ b/gcc/config/rs6000/fusion.md
>> @@ -0,0 +1,357 @@
>> +;; -*- buffer-read-only: t -*-
>> +;; Generated automatically by genfusion.pl
>> +
>> +;; Copyright (C) 2020 Free Software Foundation, Inc.
>> +;;
>> +;; This file is part of GCC.
>> +;;
>> +;; GCC is free software; you can redistribute it and/or modify it under
>> +;; the terms of the GNU General Public License as published by the Free
>> +;; Software Foundation; either version 3, or (at your option) any later
>> +;; version.
>> +;;
>> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +;; for more details.
>> +;;
>> +;; You should have received a copy of the GNU General Public License
>> +;; along with GCC; see the file COPYING3.  If not see
>> +;; <http://www.gnu.org/licenses/>.
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is DI result mode is clobber compare mode is CC extend is none
>> +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
>> +   (clobber (match_scratch:DI 0 "=r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CC (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
>> +(define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
>> +   (clobber (match_scratch:DI 0 "=r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is DI result mode is DI compare mode is CC extend is none
>> +(define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))
>> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CC (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is DI result mode is DI compare mode is CCUNS extend is none
>> +(define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:DI 3 "const_0_to_1_operand" "n")))
>> +   (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "ld%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is SI result mode is clobber compare mode is CC extend is none
>> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>> +   (clobber (match_scratch:SI 0 "=r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CC (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is SI result mode is clobber compare mode is CCUNS extend is none
>> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
>> +   (clobber (match_scratch:SI 0 "=r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is SI result mode is SI compare mode is CC extend is none
>> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CC (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is SI result mode is SI compare mode is CCUNS extend is none
>> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
>> +   (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (match_dup 1))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
>> +(define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lwa%X1 %0,%1\;cmpdi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_DS))"
>> +  [(set (match_dup 0) (sign_extend:EXTSI (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CC (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is SI result mode is EXTSI compare mode is CCUNS extend is zero
>> +(define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:SI 3 "const_0_to_1_operand" "n")))
>> +   (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (zero_extend:EXTSI (match_dup 1)))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lwz%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), SImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (zero_extend:EXTSI (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is HI result mode is clobber compare mode is CC extend is sign
>> +(define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
>> +   (clobber (match_scratch:GPR 0 "=r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (sign_extend:GPR (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CC (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is HI result mode is clobber compare mode is CCUNS extend is zero
>> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
>> +   (clobber (match_scratch:GPR 0 "=r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is HI result mode is EXTHI compare mode is CC extend is sign
>> +(define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
>> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
>> +        (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:HI 3 "const_m1_to_1_operand" "n")))
>> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (sign_extend:EXTHI (match_dup 1)))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lha%X1 %0,%1\;cmpdi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (sign_extend:EXTHI (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CC (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is HI result mode is EXTHI compare mode is CCUNS extend is zero
>> +(define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:HI 3 "const_0_to_1_operand" "n")))
>> +   (set (match_operand:EXTHI 0 "gpc_reg_operand" "=r") (zero_extend:EXTHI (match_dup 1)))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lhz%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), HImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (zero_extend:EXTHI (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is QI result mode is clobber compare mode is CCUNS extend is zero
>> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
>> +   (clobber (match_scratch:GPR 0 "=r"))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
>> +;; load mode is QI result mode is GPR compare mode is CCUNS extend is zero
>> +(define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
>> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
>> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
>> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
>> +   (set (match_operand:GPR 0 "gpc_reg_operand" "=r") (zero_extend:GPR (match_dup 1)))]
>> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
>> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"
>> +  "&& reload_completed
>> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
>> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), QImode, NON_PREFIXED_D))"
>> +  [(set (match_dup 0) (zero_extend:GPR (match_dup 1)))
>> +   (set (match_dup 2)
>> +        (compare:CCUNS (match_dup 0)
>> +		    (match_dup 3)))]
>> +  ""
>> +  [(set_attr "type" "load")
>> +   (set_attr "cost" "8")
>> +   (set_attr "length" "8")])
>> +
>> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
>> new file mode 100755
>> index 00000000000..494537c9439
>> --- /dev/null
>> +++ b/gcc/config/rs6000/genfusion.pl
>> @@ -0,0 +1,144 @@
>> +#!/usr/bin/perl -w
>> +# Generate fusion.md 
>> +# Copyright (C) 2020 Free Software Foundation, Inc.
>> +#
>> +# This file is part of GCC.
>> +#
>> +# GCC is free software; you can redistribute it and/or modify
>> +# it under the terms of the GNU General Public License as published by
>> +# the Free Software Foundation; either version 3, or (at your option)
>> +# any later version.
>> +#
>> +# GCC is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with GCC; see the file COPYING3.  If not see
>> +# <http://www.gnu.org/licenses/>.
>> +
>> +my $copyright =  <<'EOF';
>> +;; -*- buffer-read-only: t -*-
>> +;; Generated automatically by genfusion.pl
>> +
>> +;; Copyright (C) 2020 Free Software Foundation, Inc.
>> +;;
>> +;; This file is part of GCC.
>> +;;
>> +;; GCC is free software; you can redistribute it and/or modify it under
>> +;; the terms of the GNU General Public License as published by the Free
>> +;; Software Foundation; either version 3, or (at your option) any later
>> +;; version.
>> +;;
>> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +;; for more details.
>> +;;
>> +;; You should have received a copy of the GNU General Public License
>> +;; along with GCC; see the file COPYING3.  If not see
>> +;; <http://www.gnu.org/licenses/>.
>> +
>> +EOF
>> +
>> +print $copyright;
>> +
>> +sub mode_to_ldst_char
>> +{
>> +    my ($mode) = @_;
>> +    if ($mode eq 'DI') { return 'd'; }
>> +    if ($mode eq 'SI') { return 'w'; }
>> +    if ($mode eq 'HI') { return 'h'; }
>> +    if ($mode eq 'QI') { return 'b'; }
>> +    return '?';
>> +}
>> +
>> +sub gen_ld_cmpi_p10
>> +{
>> +  LMODE: foreach $lmode ('DI','SI','HI','QI') {
>> +      $ldst = mode_to_ldst_char($lmode);
>> +      $clobbermode = $lmode;
>> +      # For clobber, we need a SI/DI reg in case we split because we have to sign/zero extend.
>> +      if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; }
>> +    RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
>> +	# EXTDI does not exist, and we cannot directly produce HI/QI results.
>> +	next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
>> +	# Don't allow EXTQI because that would allow HI result which we can't do.
>> +	if ( $result eq "EXTQI" ) { $result = "GPR"; }
>> +      CCMODE: foreach $ccmode ('CC','CCUNS') {
>> +	  $np = "NON_PREFIXED_D";
>> +	  if ( $ccmode eq 'CC' ) {
>> +	      next CCMODE if $lmode eq 'QI';
>> +	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
>> +		  # ld and lwa are both DS-FORM.
>> +		  $np = "NON_PREFIXED_DS";
>> +	      }
>> +	      $cmpl = "";
>> +	      $echr = "a";
>> +	      $constpred = "const_m1_to_1_operand";
>> +	  } else {
>> +	      if ( $lmode eq 'DI' ) {
>> +		  # ld is DS-form, but lwz is not.
>> +		  $np = "NON_PREFIXED_DS";
>> +	      }
>> +	      $cmpl = "l";
>> +	      $echr = "z";
>> +	      $constpred = "const_0_to_1_operand";
>> +	  }
>> +	  if ($lmode eq 'DI') { $echr = ""; }
>> +	  if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {
>> +	      # We always need extension if result > lmode.
>> +	      if ( $ccmode eq 'CC' ) {
>> +		  $extend = "sign";
>> +	      } else {
>> +		  $extend = "zero";
>> +	      }
>> +	  } else {
>> +	      # Result of SI/DI does not need sign extension.
>> +	      $extend = "none";
>> +	  }
>> +	  print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
>> +	  print ";; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend\n";
>> +
>> +	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
>> +	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
>> +	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"non_update_memory_operand\" \"m\")\n";
>> +	  print "                 (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
>> +	  if ($result eq 'clobber') {
>> +	      print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
>> +	  } elsif ($result eq $lmode) {
>> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (match_dup 1))]\n";
>> +	  } else {
>> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
>> +	  }
>> +	  print "  \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n";
>> +	  print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n";
>> +	  print "  \"&& reload_completed\n";
>> +	  print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
>> +	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), ${lmode}mode, ${np}))\"\n";
>> +	  if ($extend eq "none") {
>> +	      print "  [(set (match_dup 0) (match_dup 1))\n";
>> +	  } else {
>> +	      $resultmode = $result;
>> +	      if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
>> +	      print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
>> +	  }
>> +	  print "   (set (match_dup 2)\n";
>> +	  print "        (compare:${ccmode} (match_dup 0)\n";
>> +	  print "		    (match_dup 3)))]\n";
>> +	  print "  \"\"\n";
>> +	  print "  [(set_attr \"type\" \"load\")\n";
>> +	  print "   (set_attr \"cost\" \"8\")\n";
>> +	  print "   (set_attr \"length\" \"8\")])\n";
>> +	  print "\n";
>> +      }
>> +    }
>> +  }
>> +}
>> +
>> +
>> +gen_ld_cmpi_p10();
>> +
>> +exit(0);
>> +
>> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
>> index 9ad5ae67302..78de8102f44 100644
>> --- a/gcc/config/rs6000/predicates.md
>> +++ b/gcc/config/rs6000/predicates.md
>> @@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand"
>>  (and (match_code "const_int")
>>       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
>> 
>> +;; Match op = -1, op = 0, or op = 1.
>> +(define_predicate "const_m1_to_1_operand"
>> +  (and (match_code "const_int")
>> +       (match_test "IN_RANGE (INTVAL (op), -1, 1)")))
>> +
>> ;; Match op = 0..3.
>> (define_predicate "const_0_to_3_operand"
>>  (and (match_code "const_int")
>> @@ -847,6 +852,15 @@ (define_special_predicate "update_address_mem"
>> 		    || GET_CODE (XEXP (op, 0)) == PRE_DEC
>> 		    || GET_CODE (XEXP (op, 0)) == PRE_MODIFY))"))
>> 
>> +;; Anything that matches memory_operand but does not update the address.
>> +(define_predicate "non_update_memory_operand"
>> +  (match_code "mem")
>> +{
>> +  if (update_address_mem (op, mode))
>> +    return 0;
>> +  return memory_operand (op, mode);
>> +})
>> +
>> ;; Return 1 if the operand is a MEM with an indexed-form address.
>> (define_special_predicate "indexed_address_mem"
>>  (match_test "(MEM_P (op)
>> diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
>> index 8d2c1ffd6cf..3e65289d8df 100644
>> --- a/gcc/config/rs6000/rs6000-cpus.def
>> +++ b/gcc/config/rs6000/rs6000-cpus.def
>> @@ -82,7 +82,9 @@
>> 
>> #define ISA_3_1_MASKS_SERVER	(ISA_3_0_MASKS_SERVER			\
>> 				 | OPTION_MASK_POWER10			\
>> -				 | OTHER_POWER10_MASKS)
>> +				 | OTHER_POWER10_MASKS			\
>> +				 | OPTION_MASK_P10_FUSION		\
>> +				 | OPTION_MASK_P10_FUSION_LD_CMPI)
>> 
>> /* Flags that need to be turned off if -mno-power9-vector.  */
>> #define OTHER_P9_VECTOR_MASKS	(OPTION_MASK_FLOAT128_HW		\
>> @@ -129,6 +131,8 @@
>> 				 | OPTION_MASK_FLOAT128_KEYWORD		\
>> 				 | OPTION_MASK_FPRND			\
>> 				 | OPTION_MASK_POWER10			\
>> +				 | OPTION_MASK_P10_FUSION		\
>> +				 | OPTION_MASK_P10_FUSION_LD_CMPI	\
>> 				 | OPTION_MASK_HTM			\
>> 				 | OPTION_MASK_ISEL			\
>> 				 | OPTION_MASK_MFCRF			\
>> diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
>> index 3c4682b0e26..cd644083558 100644
>> --- a/gcc/config/rs6000/rs6000-protos.h
>> +++ b/gcc/config/rs6000/rs6000-protos.h
>> @@ -191,6 +191,8 @@ enum non_prefixed_form {
>> 
>> extern enum insn_form address_to_insn_form (rtx, machine_mode,
>> 					    enum non_prefixed_form);
>> +extern bool address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
>> +				       enum non_prefixed_form non_prefix_format);
>> extern bool prefixed_load_p (rtx_insn *);
>> extern bool prefixed_store_p (rtx_insn *);
>> extern bool prefixed_paddi_p (rtx_insn *);
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 517467ebc63..759551d07ec 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p)
>>  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
>>    rs6000_isa_flags |= OPTION_MASK_MMA;
>> 
>> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
>> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
>> +
>> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)
>> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI;
>> +
>>  /* Turn off vector pair/mma options on non-power10 systems.  */
>>  else if (!TARGET_POWER10 && TARGET_MMA)
>>    {
>> @@ -23614,6 +23620,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
>>  { "power9-minmax",		OPTION_MASK_P9_MINMAX,		false, true  },
>>  { "power9-misc",		OPTION_MASK_P9_MISC,		false, true  },
>>  { "power9-vector",		OPTION_MASK_P9_VECTOR,		false, true  },
>> +  { "power10-fusion",		OPTION_MASK_P10_FUSION,		false, true  },
>>  { "powerpc-gfxopt",		OPTION_MASK_PPC_GFXOPT,		false, true  },
>>  { "powerpc-gpopt",		OPTION_MASK_PPC_GPOPT,		false, true  },
>>  { "prefixed",			OPTION_MASK_PREFIXED,		false, true  },
>> @@ -25705,6 +25712,50 @@ address_to_insn_form (rtx addr,
>>  return INSN_FORM_BAD;
>> }
>> 
>> +/* Given address rtx ADDR for a load of MODE, is this legitimate for a
>> +   non-prefixed D-form or X-form instruction?  NON_PREFIXED_FORMAT is
>> +   given NON_PREFIXED_D or NON_PREFIXED_DS to indicate whether we want
>> +   a D-form or DS-form instruction.  X-form and base_reg are always
>> +   allowed.  */
>> +bool
>> +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
>> +			   enum non_prefixed_form non_prefixed_format)
>> +{
>> +  enum insn_form result_form;
>> +
>> +  result_form = address_to_insn_form (addr, mode, non_prefixed_format);
>> +
>> +  switch (non_prefixed_format)
>> +    {
>> +    case NON_PREFIXED_D:
>> +      switch (result_form)
>> +	{
>> +	case INSN_FORM_X:
>> +	case INSN_FORM_D:
>> +	case INSN_FORM_DS:
>> +	case INSN_FORM_BASE_REG:
>> +	  return true;
>> +	default:
>> +	  break;
>> +	}
>> +      break;
>> +    case NON_PREFIXED_DS:
>> +      switch (result_form)
>> +	{
>> +	case INSN_FORM_X:
>> +	case INSN_FORM_DS:
>> +	case INSN_FORM_BASE_REG:
>> +	  return true;
>> +	default:
>> +	  break;
>> +	}
>> +      break;
>> +    default:
>> +      break;
>> +    }
>> +  return false;
>> +}
>> +
>> /* Helper function to see if we're potentially looking at lfs/stfs.
>>   - PARALLEL containing a SET and a CLOBBER
>>   - stfs:
>> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
>> index 5bf9c83fc1e..307c0b200bd 100644
>> --- a/gcc/config/rs6000/rs6000.h
>> +++ b/gcc/config/rs6000/rs6000.h
>> @@ -539,6 +539,7 @@ extern int rs6000_vector_align[];
>> #define MASK_UPDATE			OPTION_MASK_UPDATE
>> #define MASK_VSX			OPTION_MASK_VSX
>> #define MASK_POWER10			OPTION_MASK_POWER10
>> +#define MASK_P10_FUSION			OPTION_MASK_P10_FUSION
>> 
>> #ifndef IN_LIBGCC2
>> #define MASK_POWERPC64			OPTION_MASK_POWERPC64
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index b89990f46bf..c39b7098978 100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -14926,3 +14926,4 @@ (define_insn "*cmpeqb_internal"
>> (include "dfp.md")
>> (include "crypto.md")
>> (include "htm.md")
>> +(include "fusion.md")
>> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
>> index 2888172cb27..008a318b98d 100644
>> --- a/gcc/config/rs6000/rs6000.opt
>> +++ b/gcc/config/rs6000/rs6000.opt
>> @@ -479,6 +479,14 @@ mpower8-vector
>> Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags)
>> Use vector and scalar instructions added in ISA 2.07.
>> 
>> +mpower10-fusion
>> +Target Report Mask(P10_FUSION) Var(rs6000_isa_flags)
>> +Fuse certain integer operations together for better performance on power10.
>> +
>> +mpower10-fusion-ld-cmpi
>> +Target Undocumented Mask(P10_FUSION_LD_CMPI) Var(rs6000_isa_flags)
>> +Fuse certain integer operations together for better performance on power10.
>> +
>> mcrypto
>> Target Report Mask(CRYPTO) Var(rs6000_isa_flags)
>> Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions.
>> diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
>> index 1ddb5729cb2..bcc71a9e21b 100644
>> --- a/gcc/config/rs6000/t-rs6000
>> +++ b/gcc/config/rs6000/t-rs6000
>> @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
>> 	$(COMPILE) $<
>> 	$(POSTCOMPILE)
>> 
>> +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
>> +	$(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md
>> +
>> $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
>>  $(srcdir)/config/rs6000/rs6000-cpus.def
>> 	$(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
>> @@ -86,4 +89,5 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs64.md \
>> 	$(srcdir)/config/rs6000/mma.md \
>> 	$(srcdir)/config/rs6000/crypto.md \
>> 	$(srcdir)/config/rs6000/htm.md \
>> -	$(srcdir)/config/rs6000/dfp.md
>> +	$(srcdir)/config/rs6000/dfp.md \
>> +	$(srcdir)/config/rs6000/fusion.md
>> -- 
>> 2.27.0
>> 
> 



* Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion
  2020-12-04 19:19 [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion acsawdey
                   ` (2 preceding siblings ...)
  2021-01-03 20:42 ` Aaron Sawdey
@ 2021-01-26  0:51 ` Segher Boessenkool
  3 siblings, 0 replies; 7+ messages in thread
From: Segher Boessenkool @ 2021-01-26  0:51 UTC (permalink / raw)
  To: acsawdey; +Cc: gcc-patches, wschmidt

Hi!

On Fri, Dec 04, 2020 at 01:19:11PM -0600, acsawdey@linux.ibm.com wrote:
> This patch adds the first batch of patterns to support p10 fusion. These
> will allow combine to create a single insn for a pair of instructions
> that that power10 can fuse and execute. These particular ones have the
> requirement that only cr0 can be used when fusing a load with a compare
> immediate of -1/0/1 (if signed) or 0/1 (if unsigned), so we want combine
> to put that requirement in, and if it doesn't work out later the splitter
> can get used.

> The patterns are generated by a script genfusion.pl and live in new file
> fusion.md. This script will be expanded to generate more patterns for
> fusion.

> This also adds option -mpower10-fusion which defaults on for power10 and
> will gate all these fusion patterns. In addition I have added an
> undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
> that just controls the load+compare-immediate patterns. I have make
> these default on for power10 but they are not disallowed for earlier
> processors because it is still valid code. This allows us to test the
> correctness of fusion code generation by turning it on explicitly.

> 	* config/rs6000/genfusion.pl: New file, script to generate
> 	define_insn_and_split patterns so combine can arrange fused
> 	instructions next to each other.
> 	* config/rs6000/fusion.md: New file, generated fused instruction
> 	patterns for combine.

So this script is never run by any target, you have to do it all
manually?  Okay.

> 	* config/rs6000/predicates.md (const_m1_to_1_operand): New predicate.
> 	(non_update_memory_operand): New predicate.
> 	* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
> 	OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER and
> 	POWERPC_MASKS.
> 	* config/rs6000/rs6000-protos.h (address_is_non_pfx_d_or_x): Add
> 	prototype.
> 	* config/rs6000/rs6000.c (rs6000_option_override_internal):
> 	automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
>  	if target is power10.

Capital A.  And, you do not set any "-m", you set OPTION_MASK_P10_FUSION,
instead.

Every new entry starts on a new line:

	(rs6000_opt_masks): Allow -mpower10-fusion in function attributes.
	(address_is_non_pfx_d_or_x): New function.

> 	* config/rs6000/rs6000.h: Add MASK_P10_FUSION.
> 	* config/rs6000/rs6000.md: Include fusion.md.
> 	* config/rs6000/rs6000.opt: Add -mpower10-fusion
> 	and -mpower10-fusion-ld-cmpi.
> 	* config/rs6000/t-rs6000: Add dependencies involving fusion.md.

> --- /dev/null
> +++ b/gcc/config/rs6000/fusion.md
> @@ -0,0 +1,357 @@
> +;; -*- buffer-read-only: t -*-

Don't do these things please.  You can make the file r/o if you really
want to, that works for all (sane) editors (and works fine in Git).

> +;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> +;; load mode is DI result mode is clobber compare mode is CC extend is none
> +(define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
> +  [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> +        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "m")
> +                 (match_operand:DI 3 "const_m1_to_1_operand" "n")))

The indent here is wrong?  Just use all spaces, that is fine, and easier
than getting tabs right in a generated file.

> +   (clobber (match_scratch:DI 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"

I would make TARGET_P10_FUSION_LD_CMPI imply TARGET_P10_FUSION, that
makes all further code simpler, no?
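
A minimal sketch of one way to get that implication, assuming it is
done by clearing the ld-cmpi bit whenever the umbrella fusion bit is
off.  The SKETCH_* masks and sketch_normalize_fusion_flags are
hypothetical stand-ins, not the real rs6000 names, and the bit
positions are made up:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for OPTION_MASK_P10_FUSION and
   OPTION_MASK_P10_FUSION_LD_CMPI; the bit positions are made up.  */
#define SKETCH_MASK_P10_FUSION         (UINT64_C (1) << 0)
#define SKETCH_MASK_P10_FUSION_LD_CMPI (UINT64_C (1) << 1)

/* Return ISA_FLAGS with the ld-cmpi bit cleared unless the umbrella
   fusion bit is also set, so that "ld-cmpi set" implies "fusion set".  */
uint64_t
sketch_normalize_fusion_flags (uint64_t isa_flags)
{
  if ((isa_flags & SKETCH_MASK_P10_FUSION) == 0)
    isa_flags &= ~SKETCH_MASK_P10_FUSION_LD_CMPI;

  /* The invariant the insn conditions could then rely on.  */
  assert ((isa_flags & SKETCH_MASK_P10_FUSION_LD_CMPI) == 0
          || (isa_flags & SKETCH_MASK_P10_FUSION) != 0);
  return isa_flags;
}
```

With something like this done once at option-override time, the insn
condition could presumably test TARGET_P10_FUSION_LD_CMPI alone.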

> +  "ld%X1 %0,%1\;cmpdi 0,%0,%3"
> +  "&& reload_completed
> +   && (cc_reg_not_cr0_operand (operands[2], CCmode)
> +       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), DImode, NON_PREFIXED_DS))"

Space after comma.  Line too long (I think you can easily break it in the
source?  If not, no man overboard :-) )

> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2)
> +        (compare:CC (match_dup 0)
> +		    (match_dup 3)))]

(Here, the line before where you use a tab should have one, already.)

If you write the first arm on one line, then why not this one (or at
least start the rhs on the same line as the lhs, etc.)?

> +  [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> +        (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
> +                 (match_operand:QI 3 "const_0_to_1_operand" "n")))
> +   (clobber (match_scratch:GPR 0 "=r"))]
> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)"
> +  "lbz%X1 %0,%1\;cmpldi 0,%0,%3"

This should use
  cmpldi %2,%0,%3
(not all assemblers allow bare "0", some require "cr0").

> --- /dev/null
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -0,0 +1,144 @@
> +#!/usr/bin/perl -w

Since Perl 5.6, you can use
  use warnings;
instead of -w.

Also:
  use strict;
(which will catch a few errors you made; nothing super serious, but
still).

> +# Generate fusion.md 

Trailing space.  Sentences end with a full stop; if you do not like that
here, maybe just add an empty line after it so it is clear it is a
heading?  Not a terrible idea in the first place :-)

> +# Copyright (C) 2020 Free Software Foundation, Inc.

Don't forget to update the year :-)

> +my $copyright =  <<'EOF';
> +;; -*- buffer-read-only: t -*-
> +;; Generated automatically by genfusion.pl
> +
> +;; Copyright (C) 2020 Free Software Foundation, Inc.
> +;;
> +;; This file is part of GCC.
> +;;
> +;; GCC is free software; you can redistribute it and/or modify it under
> +;; the terms of the GNU General Public License as published by the Free
> +;; Software Foundation; either version 3, or (at your option) any later
> +;; version.
> +;;
> +;; GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +;; WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +;; for more details.
> +;;
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +EOF
> +
> +print $copyright;

You can just

print <<'EOF';
bla
bla
bla
EOF

> +sub mode_to_ldst_char
> +{
> +    my ($mode) = @_;
> +    if ($mode eq 'DI') { return 'd'; }
> +    if ($mode eq 'SI') { return 'w'; }
> +    if ($mode eq 'HI') { return 'h'; }
> +    if ($mode eq 'QI') { return 'b'; }
> +    return '?';
> +}

That invites using a hash:
  my %x = (DI => 'd', SI => 'w', HI => 'h', QI => 'b');
(or even  %x = qw(DI d  SI w  HI h  QI b);  )
  return $x{$mode} if exists $x{$mode};
  return '?';

> +sub gen_ld_cmpi_p10
> +{
> +  LMODE: foreach $lmode ('DI','SI','HI','QI') {

Here, you haven't declared $lmode; strict will error about that (use
"my $lmode").

> +      $ldst = mode_to_ldst_char($lmode);
> +      $clobbermode = $lmode;
> +      # For clobber, we need a SI/DI reg in case we split because we have to sign/zero extend.

(line too long)

> +      if ( $lmode eq 'HI' || $lmode eq 'QI' ) { $clobbermode = "GPR"; }
> +    RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
> +	# EXTDI does not exist, and we cannot directly produce HI/QI results.
> +	next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
> +	# Don't allow EXTQI because that would allow HI result which we can't do.
> +	if ( $result eq "EXTQI" ) { $result = "GPR"; }

No spaces inside parens please.

Other ways to write this are
  $result eq "EXTQI" and $result = "GPR";
and
  $result = "GPR" if $result eq "EXTQI";
both of which are a bit easier to read.

> +      CCMODE: foreach $ccmode ('CC','CCUNS') {
> +	  $np = "NON_PREFIXED_D";
> +	  if ( $ccmode eq 'CC' ) {
> +	      next CCMODE if $lmode eq 'QI';
> +	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +		  # ld and lwa are both DS-FORM.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "";
> +	      $echr = "a";
> +	      $constpred = "const_m1_to_1_operand";
> +	  } else {
> +	      if ( $lmode eq 'DI' ) {
> +		  # ld is DS-form, but lwz is not.
> +		  $np = "NON_PREFIXED_DS";
> +	      }
> +	      $cmpl = "l";
> +	      $echr = "z";
> +	      $constpred = "const_0_to_1_operand";
> +	  }
> +	  if ($lmode eq 'DI') { $echr = ""; }
> +	  if ($result =~ m/EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {

Please use a slightly stricter regexp (maybe /^EXT/); overly lax regexps
will always surprise you with what they can match, in the end :-)

> +	      # We always need extension if result > lmode.
> +	      if ( $ccmode eq 'CC' ) {
> +		  $extend = "sign";
> +	      } else {
> +		  $extend = "zero";
> +	      }
> +	  } else {
> +	      # Result of SI/DI does not need sign extension.
> +	      $extend = "none";
> +	  }
> +	  print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
> +	  print ";; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend\n";
> +
> +	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";

You can use qq// if you need to include double quotes in a double-quoted
string, like
  print qq(Oh look, "some quotes", isn't it great!\n);
or
  print qq/Oh look, "some quotes", isn't it great!\n/;
or pretty much whatever other character you want as separator :-)

> +	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
> +	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"non_update_memory_operand\" \"m\")\n";
> +	  print "                 (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
> +	  if ($result eq 'clobber') {
> +	      print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
> +	  } elsif ($result eq $lmode) {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (match_dup 1))]\n";
> +	  } else {
> +	      print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
> +	  }
> +	  print "  \"(TARGET_P10_FUSION && TARGET_P10_FUSION_LD_CMPI)\"\n";
> +	  print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di 0,%0,%3\"\n";

s/ 0/ %2/

> +	  print "  \"&& reload_completed\n";
> +	  print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
> +	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1],0), ${lmode}mode, ${np}))\"\n";

	  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),\n";
	  print "                                      ${lmode}mode, ${np}))\"\n";

or something like that.

> +	  if ($extend eq "none") {
> +	      print "  [(set (match_dup 0) (match_dup 1))\n";
> +	  } else {
> +	      $resultmode = $result;
> +	      if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
> +	      print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
> +	  }
> +	  print "   (set (match_dup 2)\n";
> +	  print "        (compare:${ccmode} (match_dup 0)\n";
> +	  print "		    (match_dup 3)))]\n";
> +	  print "  \"\"\n";
> +	  print "  [(set_attr \"type\" \"load\")\n";
> +	  print "   (set_attr \"cost\" \"8\")\n";
> +	  print "   (set_attr \"length\" \"8\")])\n";
> +	  print "\n";

You can also use a here-document (<<) for long prints (you can
interpolate variables in that just fine if you use <<"HERE", i.e.
double-quote the terminator string).

Anyway, the only thing you really need to improve in the Perl code now
is "use strict;".  The rest you can do later :-)

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4423,6 +4423,12 @@ rs6000_option_override_internal (bool global_init_p)
>    if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
>      rs6000_isa_flags |= OPTION_MASK_MMA;
>  
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
> +    rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
> +
> +  if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)

  if (TARGET_POWER10
      && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)

> +bool
> +address_is_non_pfx_d_or_x (rtx addr, machine_mode mode,
> +			   enum non_prefixed_form non_prefixed_format)
> +{
> +  enum insn_form result_form;
> +
> +  result_form = address_to_insn_form (addr, mode, non_prefixed_format);
> +
> +  switch (non_prefixed_format)
> +    {
> +    case NON_PREFIXED_D:
> +      switch (result_form)
> +	{
> +	case INSN_FORM_X:
> +	case INSN_FORM_D:
> +	case INSN_FORM_DS:
> +	case INSN_FORM_BASE_REG:
> +	  return true;
> +	default:
> +	  break;
> +	}

"default: break;" is always superfluous.  Also, please "return false;"
everywhere you do "break" to just get there.
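
Roughly like the sketch below.  To keep it self-contained it uses
hypothetical stand-in enums (sketch_insn_form, sketch_non_prefixed)
instead of the real insn_form and non_prefixed_form, and it takes the
already classified form as an argument instead of calling
address_to_insn_form itself; the accepted forms are the ones from the
patch:

```c
#include <stdbool.h>

/* Stand-ins for the GCC enums; only the members needed here.  */
enum sketch_insn_form
{
  SKETCH_FORM_BAD,
  SKETCH_FORM_BASE_REG,
  SKETCH_FORM_D,
  SKETCH_FORM_DS,
  SKETCH_FORM_X,
  SKETCH_FORM_PREFIXED
};

enum sketch_non_prefixed
{
  SKETCH_NON_PREFIXED_D,
  SKETCH_NON_PREFIXED_DS,
  SKETCH_NON_PREFIXED_OTHER
};

/* Is FORM acceptable when a non-prefixed FORMAT load is wanted?
   Every path returns directly, so there is no "break" followed by a
   trailing "return false".  */
bool
sketch_form_is_non_pfx_d_or_x (enum sketch_insn_form form,
                               enum sketch_non_prefixed format)
{
  switch (format)
    {
    case SKETCH_NON_PREFIXED_D:
      switch (form)
        {
        case SKETCH_FORM_X:
        case SKETCH_FORM_D:
        case SKETCH_FORM_DS:
        case SKETCH_FORM_BASE_REG:
          return true;
        default:
          return false;
        }

    case SKETCH_NON_PREFIXED_DS:
      switch (form)
        {
        case SKETCH_FORM_X:
        case SKETCH_FORM_DS:
        case SKETCH_FORM_BASE_REG:
          return true;
        default:
          return false;
        }

    default:
      return false;
    }
}
```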

> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -47,6 +47,9 @@ rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
>  	$(COMPILE) $<
>  	$(POSTCOMPILE)
>  
> +$(srcdir)/config/rs6000/fusion.md: $(srcdir)/config/rs6000/genfusion.pl
> +	$(srcdir)/config/rs6000/genfusion.pl > $(srcdir)/config/rs6000/fusion.md

Ah, so you *do* generate it always.  Hrm, I'm not sure I like that,
certainly not now.  Comment out this line, and maybe enable it again in
stage 1?

Okay for trunk with those things taken into account.  Thank you!


Segher


end of thread, other threads:[~2021-01-26  0:52 UTC | newest]

Thread overview: 7+ messages
2020-12-04 19:19 [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion acsawdey
2020-12-07 20:48 ` will schmidt
2020-12-21 18:11 ` Pat Haugen
2020-12-21 22:48   ` Segher Boessenkool
2021-01-03 20:42 ` Aaron Sawdey
2021-01-19  4:43   ` Aaron Sawdey
2021-01-26  0:51 ` Segher Boessenkool
