public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI
@ 2022-10-14  8:34 Haochen Jiang
  2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Haochen Jiang @ 2022-10-14  8:34 UTC (permalink / raw)
  To: gcc-patches
  Cc: rguenther, hongtao.liu, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, segher, linkw, uweigand, krebbel, olegendo, davem,
	ebotcazou, jeffreyalaw, dave.anglin

Hi all,

Sorry for the previous cover-letter stucking and disturbance and this
is the right cover letter.

These two patches aimed to add Intel PREFETCHI.

The information is based on newly released
Intel Architecture Instruction Set Extensions and Future Features.

The document comes following:
https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

The first patch added a fourth parameter for prefetch to align with LLVM
in middle end. Currently LLVM had a fourth parameter to indicate what is
prefetching. Also added a warning on backends that does not support
instruction prefetch in machine description file to tell users attempting
using prefetchi that the backend will change it to data prefetch.

The second patch was i386 specific and added PREFETCHI to i386.

Regtested on x86_64-pc-linux-gnu and cross-compiled to other backends.
For other backends, I ran through the compile test and no regressions found.
Since I did not have machines from other backends, could you kindly help
me to test with other machines? I suppose there should not have regressions
since I just added a warning to the md file and corresponding testcase.

Ok for trunk?

BRs,
Haochen




^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-14  8:34 [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI Haochen Jiang
@ 2022-10-14  8:34 ` Haochen Jiang
  2022-10-14  8:46   ` Hongtao Liu
                     ` (3 more replies)
  2022-10-14  8:34 ` [PATCH 2/2] Support Intel prefetchit0/t1 Haochen Jiang
  2022-10-19 20:49 ` [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI Segher Boessenkool
  2 siblings, 4 replies; 26+ messages in thread
From: Haochen Jiang @ 2022-10-14  8:34 UTC (permalink / raw)
  To: gcc-patches
  Cc: rguenther, hongtao.liu, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, segher, linkw, uweigand, krebbel, olegendo, davem,
	ebotcazou, jeffreyalaw, dave.anglin

gcc/ChangeLog:

	* builtins.cc (expand_builtin_prefetch): Handle the fourth parameter in
	expand function.
	* config/aarch64/aarch64-sve.md: Add default parameter value.
	* config/aarch64/aarch64.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/alpha/alpha.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/arc/arc.md: Add default parameter value.
	* config/arm/arm.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/frv/frv.md: Ditto.
	* config/i386/i386.md: Ditto.
	* config/ia64/ia64.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/mips/mips.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/pa/pa.md: Ditto.
	* config/rs6000/rs6000.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for
	gen_prefetch call.
	(s390_expand_setmem): Ditto.
	(s390_expand_cmpmem): Ditto.
	* config/s390/s390.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/sh/sh.md: Ditto.
	* config/sparc/sparc.md: Ditto.
	* doc/rtl.texi: Document cache variable for prefetch.
	* rtl.def (PREFETCH): Change prefetch DEF_RTL_EXPR to add fourth parameter.
	* rtlanal.cc (setup_reg_subrtx_bounds): Change gcc_checking_assert for
	fourth parameter.
	* target-insns.def (prefetch): Add fourth rtx for prefetch.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/builtin-prefetch-1.c: Add fourth parameter for
	testcases.
	* gcc.c-torture/execute/builtin-prefetch-2.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-3.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-4.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-5.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-6.c: Ditto.
	* gcc.dg/builtin-prefetch-1.c: Ditto.
	* gcc.misc-tests/i386-pf-3dnow-1.c: Ditto.
	* gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
	* gcc.misc-tests/i386-pf-none-1.c: Ditto.
	* gcc.misc-tests/i386-pf-sse-1.c: Ditto.
	* gcc.target/i386/avx-1.c: Change prefetch macro define to variable args.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/aarch64/prefetchi-1.c: New test.
	* gcc.target/alpha/prefetchi-1.c: Ditto.
	* gcc.target/arc/prefetchi-1.c: Ditto.
	* gcc.target/arm/prefetchi-1.c: Ditto.
	* gcc.target/hppa/prefetchi-1.c: Ditto.
	* gcc.target/i386/prefetchi-1.c: Ditto.
	* gcc.target/ia64/prefetchi-1.c: Ditto.
	* gcc.target/mips/prefetchi-1.c: Ditto.
	* gcc.target/powerpc/prefetchi-1.c: Ditto.
	* gcc.target/s390/prefetchi-1.c: Ditto.
	* gcc.target/sh/prefetchi-1.c: Ditto.
	* gcc.target/sparc/prefetchi-1.c: Ditto.
---
 gcc/builtins.cc                               |  34 ++++--
 gcc/config/aarch64/aarch64-sve.md             |  15 ++-
 gcc/config/aarch64/aarch64.md                 |  19 +++-
 gcc/config/alpha/alpha.md                     |  19 +++-
 gcc/config/arc/arc.md                         |  20 +++-
 gcc/config/arm/arm.md                         |  19 +++-
 gcc/config/frv/frv.md                         |   6 +-
 gcc/config/i386/i386.md                       |  17 ++-
 gcc/config/ia64/ia64.md                       |  19 +++-
 gcc/config/mips/mips.md                       |  22 +++-
 gcc/config/pa/pa.md                           |  12 +-
 gcc/config/rs6000/rs6000.md                   |  19 +++-
 gcc/config/s390/s390.cc                       |  10 +-
 gcc/config/s390/s390.md                       |  19 +++-
 gcc/config/sh/sh.md                           |  15 ++-
 gcc/config/sparc/sparc.md                     |  15 ++-
 gcc/doc/rtl.texi                              |   6 +-
 gcc/rtl.def                                   |   5 +-
 gcc/rtlanal.cc                                |   2 +-
 gcc/target-insns.def                          |   2 +-
 .../execute/builtin-prefetch-1.c              |  45 ++++----
 .../execute/builtin-prefetch-2.c              | 106 +++++++++---------
 .../execute/builtin-prefetch-3.c              |  92 +++++++--------
 .../execute/builtin-prefetch-4.c              |  44 ++++----
 .../execute/builtin-prefetch-5.c              |  12 +-
 .../execute/builtin-prefetch-6.c              |   4 +-
 gcc/testsuite/gcc.dg/builtin-prefetch-1.c     |   5 +-
 .../gcc.misc-tests/i386-pf-3dnow-1.c          |  16 +--
 .../gcc.misc-tests/i386-pf-athlon-1.c         |  16 +--
 gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c |  16 +--
 gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  16 +--
 .../gcc.target/aarch64/prefetchi-1.c          |  11 ++
 gcc/testsuite/gcc.target/alpha/prefetchi-1.c  |  11 ++
 gcc/testsuite/gcc.target/arc/prefetchi-1.c    |  11 ++
 gcc/testsuite/gcc.target/arm/prefetchi-1.c    |  11 ++
 gcc/testsuite/gcc.target/hppa/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/i386/avx-1.c         |   2 +-
 gcc/testsuite/gcc.target/i386/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/i386/sse-13.c        |   2 +-
 gcc/testsuite/gcc.target/i386/sse-23.c        |   2 +-
 gcc/testsuite/gcc.target/ia64/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/mips/prefetchi-1.c   |  11 ++
 .../gcc.target/powerpc/prefetchi-1.c          |  11 ++
 gcc/testsuite/gcc.target/s390/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/sh/prefetchi-1.c     |  11 ++
 gcc/testsuite/gcc.target/sparc/prefetchi-1.c  |  11 ++
 46 files changed, 564 insertions(+), 241 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/alpha/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/arc/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/hppa/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/ia64/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/mips/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/sh/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/prefetchi-1.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 5f319b28030..2e6d0c76beb 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -1282,18 +1282,18 @@ expand_builtin_update_setjmp_buf (rtx buf_addr)
 static void
 expand_builtin_prefetch (tree exp)
 {
-  tree arg0, arg1, arg2;
+  tree arg0, arg1, arg2, arg3;
   int nargs;
-  rtx op0, op1, op2;
+  rtx op0, op1, op2, op3;
 
   if (!validate_arglist (exp, POINTER_TYPE, 0))
     return;
 
   arg0 = CALL_EXPR_ARG (exp, 0);
 
-  /* Arguments 1 and 2 are optional; argument 1 (read/write) defaults to
-     zero (read) and argument 2 (locality) defaults to 3 (high degree of
-     locality).  */
+  /* Arguments 1, 2, 3 are optional; argument 1 (read/write) defaults to
+     zero (read); argument 2 (locality) defaults to 3 (high degree of
+     locality); argument 3 (cache type) defaults to 1 (data).  */
   nargs = call_expr_nargs (exp);
   if (nargs > 1)
     arg1 = CALL_EXPR_ARG (exp, 1);
@@ -1303,6 +1303,10 @@ expand_builtin_prefetch (tree exp)
     arg2 = CALL_EXPR_ARG (exp, 2);
   else
     arg2 = integer_three_node;
+  if (nargs > 3)
+    arg3 = CALL_EXPR_ARG (exp, 3);
+  else
+    arg3 = integer_one_node;
 
   /* Argument 0 is an address.  */
   op0 = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL);
@@ -1336,14 +1340,30 @@ expand_builtin_prefetch (tree exp)
       op2 = const0_rtx;
     }
 
+  /* Argument 3 (cache type) must be a compile-time constant int.  */
+  if (TREE_CODE (arg3) != INTEGER_CST)
+    {
+      error ("fourth argument to %<__builtin_prefetch%> must be a constant");
+      arg3 = integer_one_node;
+    }
+  op3 = expand_normal (arg3);
+  /* Argument 3 must be either zero or one.  */
+  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
+    {
+      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
+	" using one");
+      op3 = const1_rtx;
+    }
+
   if (targetm.have_prefetch ())
     {
-      class expand_operand ops[3];
+      class expand_operand ops[4];
 
       create_address_operand (&ops[0], op0);
       create_integer_operand (&ops[1], INTVAL (op1));
       create_integer_operand (&ops[2], INTVAL (op2));
-      if (maybe_expand_insn (targetm.code_for_prefetch, 3, ops))
+      create_integer_operand (&ops[3], INTVAL (op3));
+      if (maybe_expand_insn (targetm.code_for_prefetch, 4, ops))
 	return;
     }
 
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index e08bee197d8..0cde862bc04 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -1944,7 +1944,8 @@
 		(match_operand:DI 2 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH)
 	     (match_operand:DI 3 "const_int_operand")
-	     (match_operand:DI 4 "const_int_operand"))]
+	     (match_operand:DI 4 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     operands[1] = gen_rtx_MEM (<MODE>mode, operands[1]);
@@ -1984,7 +1985,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
@@ -2013,7 +2015,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
@@ -2044,7 +2047,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
@@ -2074,7 +2078,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f2e3d905dbb..94fa6b4200c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -818,10 +818,25 @@
   [(set_attr "type" "no_insn")]
 )
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand")
+            (match_operand:QI 1 "const_int_operand")
+            (match_operand:QI 2 "const_int_operand")
+	    (match_operand:QI 3 "const_int_operand"))]
+  ""
+  {
+    if (INTVAL (operands[3]) == 0)
+    {
+      warning (0, "instruction prefetch is not supported; using data prefetch");
+      operands[3] = const1_rtx;
+    }
+  })
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand" "Dp")
             (match_operand:QI 1 "const_int_operand" "")
-            (match_operand:QI 2 "const_int_operand" ""))]
+            (match_operand:QI 2 "const_int_operand" "")
+	    (const_int 1))]
   ""
   {
     const char * pftype[2][4] =
diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
index 87514330c22..46fd6a7b7cb 100644
--- a/gcc/config/alpha/alpha.md
+++ b/gcc/config/alpha/alpha.md
@@ -5176,10 +5176,25 @@
 ;;
 ;; On EV6, these become official prefetch instructions.
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:DI 0 "address_operand")
+	     (match_operand:DI 1 "const_int_operand")
+	     (match_operand:DI 2 "const_int_operand")
+	     (match_operand:DI 3 "const_int_operand"))]
+  "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:DI 0 "address_operand" "p")
 	     (match_operand:DI 1 "const_int_operand" "n")
-	     (match_operand:DI 2 "const_int_operand" "n"))]
+	     (match_operand:DI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
 {
   /* Interpret "no temporal locality" as this data should be evicted once
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 458d3edf716..9607a0dd572 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -5255,14 +5255,22 @@ archs4x, archs4xd"
 (define_expand "prefetch"
   [(prefetch (match_operand:SI 0 "address_operand" "")
 	     (match_operand:SI 1 "const_int_operand" "")
-	     (match_operand:SI 2 "const_int_operand" ""))]
+	     (match_operand:SI 2 "const_int_operand" "")
+	     (match_operand:SI 3 "const_int_operand" ""))]
   "TARGET_HS"
-  "")
+  {
+    if (INTVAL (operands[3]) == 0)
+    {
+      warning (0, "instruction prefetch is not supported; using data prefetch");
+      operands[3] = const1_rtx;
+    }
+  })
 
 (define_insn "prefetch_1"
   [(prefetch (match_operand:SI 0 "register_operand" "r")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_HS"
   {
    if (INTVAL (operands[1]))
@@ -5277,7 +5285,8 @@ archs4x, archs4xd"
   [(prefetch (plus:SI (match_operand:SI 0 "register_operand" "r,r,r")
 		      (match_operand:SI 1 "nonmemory_operand" "r,Cm2,Cal"))
 	     (match_operand:SI 2 "const_int_operand" "n,n,n")
-	     (match_operand:SI 3 "const_int_operand" "n,n,n"))]
+	     (match_operand:SI 3 "const_int_operand" "n,n,n")
+	     (const_int 1))]
   "TARGET_HS"
   {
    if (INTVAL (operands[2]))
@@ -5291,7 +5300,8 @@ archs4x, archs4xd"
 (define_insn "prefetch_3"
   [(prefetch (match_operand:SI 0 "address_operand" "p")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_HS"
   {
    operands[0] = gen_rtx_MEM (SImode, operands[0]);
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 69bf343fb0e..7f2ec97406f 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -12206,10 +12206,25 @@
 
 ;; V5E instructions.
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:SI 0 "address_operand")
+	     (match_operand:SI 1 "")
+	     (match_operand:SI 2 "")
+	     (match_operand:SI 3 ""))]
+  "TARGET_32BIT && arm_arch5te"
+  {
+    if (INTVAL (operands[3]) == 0)
+    {
+      warning (0, "instruction prefetch is not supported; using data prefetch");
+      operands[3] = const1_rtx;
+    }
+  })
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:SI 0 "address_operand" "p")
 	     (match_operand:SI 1 "" "")
-	     (match_operand:SI 2 "" ""))]
+	     (match_operand:SI 2 "" "")
+	     (const_int 1))]
   "TARGET_32BIT && arm_arch5te"
   "pld\\t%a0"
   [(set_attr "type" "load_4")]
diff --git a/gcc/config/frv/frv.md b/gcc/config/frv/frv.md
index 6258fe3b99e..2fb9de593c9 100644
--- a/gcc/config/frv/frv.md
+++ b/gcc/config/frv/frv.md
@@ -7631,7 +7631,8 @@
   [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
 			UNSPEC_PREFETCH0)
 	     (const_int 0)
-	     (const_int 0))]
+	     (const_int 0)
+	     (const_int 1))]
   ""
   "dcpl %0, gr0, #0"
   [(set_attr "length" "4")])
@@ -7640,7 +7641,8 @@
   [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
 			UNSPEC_PREFETCH)
 	     (const_int 0)
-	     (const_int 0))]
+	     (const_int 0)
+	     (const_int 1))]
   "TARGET_FR500_FR550_BUILTINS"
   "nop.p\\n\\tnldub @(%0, gr0), gr0"
   [(set_attr "length" "8")])
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8e847520491..c65cf14b9f4 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -23635,9 +23635,15 @@
 (define_expand "prefetch"
   [(prefetch (match_operand 0 "address_operand")
 	     (match_operand:SI 1 "const_int_operand")
-	     (match_operand:SI 2 "const_int_operand"))]
+	     (match_operand:SI 2 "const_int_operand")
+	     (match_operand:SI 3 "const_int_operand"))]
   "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
 {
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
   bool write = operands[1] != const0_rtx;
   int locality = INTVAL (operands[2]);
 
@@ -23679,7 +23685,8 @@
 (define_insn "*prefetch_sse"
   [(prefetch (match_operand 0 "address_operand" "p")
 	     (const_int 0)
-	     (match_operand:SI 1 "const_int_operand"))]
+	     (match_operand:SI 1 "const_int_operand")
+	     (const_int 1))]
   "TARGET_PREFETCH_SSE"
 {
   static const char * const patterns[4] = {
@@ -23700,7 +23707,8 @@
 (define_insn "*prefetch_3dnow"
   [(prefetch (match_operand 0 "address_operand" "p")
 	     (match_operand:SI 1 "const_int_operand")
-	     (const_int 3))]
+	     (const_int 3)
+	     (const_int 1))]
   "TARGET_3DNOW || TARGET_PRFCHW || TARGET_PREFETCHWT1"
 {
   if (operands[1] == const0_rtx)
@@ -23716,7 +23724,8 @@
 (define_insn "*prefetch_prefetchwt1"
   [(prefetch (match_operand 0 "address_operand" "p")
 	     (const_int 1)
-	     (const_int 2))]
+	     (const_int 2)
+	     (const_int 1))]
   "TARGET_PREFETCHWT1"
   "prefetchwt1\t%a0";
   [(set_attr "type" "sse")
diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index 5d1d47da55b..9fbbea3412a 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -5018,10 +5018,25 @@
   "break.f 0"
   [(set_attr "itanium_class" "nop_f")])
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:DI 0 "address_operand")
+	     (match_operand:DI 1 "const_int_operand")
+	     (match_operand:DI 2 "const_int_operand")
+	     (match_operand:DI 3 "const_int_operand"))]
+  ""
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:DI 0 "address_operand" "p")
 	     (match_operand:DI 1 "const_int_operand" "n")
-	     (match_operand:DI 2 "const_int_operand" "n"))]
+	     (match_operand:DI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
   static const char * const alt[2][4] = {
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index e0f0a582732..b5c547806b4 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -7227,10 +7227,25 @@
 ;;
 
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:QI 0 "address_operand")
+	     (match_operand 1 "const_int_operand")
+	     (match_operand 2 "const_int_operand")
+	     (match_operand 3 "const_int_operand"))]
+  "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:QI 0 "address_operand" "ZD")
 	     (match_operand 1 "const_int_operand" "n")
-	     (match_operand 2 "const_int_operand" "n"))]
+	     (match_operand 2 "const_int_operand" "n")
+	     (const_int 1))]
   "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
 {
   if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT)
@@ -7257,7 +7272,8 @@
   [(prefetch (plus:P (match_operand:P 0 "register_operand" "d")
 		     (match_operand:P 1 "register_operand" "d"))
 	     (match_operand 2 "const_int_operand" "n")
-	     (match_operand 3 "const_int_operand" "n"))]
+	     (match_operand 3 "const_int_operand" "n")
+	     (const_int 1))]
   "ISA_HAS_PREFETCHX && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
 {
   if (TARGET_LOONGSON_EXT)
diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index 76ae35d4cfa..a7469074c01 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -10201,9 +10201,16 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 (define_expand "prefetch"
   [(match_operand 0 "address_operand" "")
    (match_operand 1 "const_int_operand" "")
-   (match_operand 2 "const_int_operand" "")]
+   (match_operand 2 "const_int_operand" "")
+   (match_operand 3 "const_int_operand" "")]
   "TARGET_PA_20"
 {
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+
   operands[0] = copy_addr_to_reg (operands[0]);
   emit_insn (gen_prefetch_20 (operands[0], operands[1], operands[2]));
   DONE;
@@ -10212,7 +10219,8 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 (define_insn "prefetch_20"
   [(prefetch (match_operand 0 "pmode_register_operand" "r")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_PA_20"
 {
   /* The SL cache-control completer indicates good spatial locality but
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ad5a4cf2ef8..21ff09eca93 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14060,10 +14060,25 @@
   DONE;
 })
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand 0 "indexed_or_indirect_address")
+	     (match_operand:SI 1 "const_int_operand")
+	     (match_operand:SI 2 "const_int_operand")
+	     (match_operand:SI 3 "const_int_operand"))]
+  ""
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
 
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index ae309471f04..3fc5ae196b8 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -5697,13 +5697,13 @@ s390_expand_cpymem (rtx dst, rtx src, rtx len)
 
 	  /* Issue a read prefetch for the +3 cache line.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, src_addr, GEN_INT (768)),
-				   const0_rtx, const0_rtx);
+				   const0_rtx, const0_rtx, const1_rtx);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	  emit_insn (prefetch);
 
 	  /* Issue a write prefetch for the +3 cache line.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, GEN_INT (768)),
-				   const1_rtx, const0_rtx);
+				   const1_rtx, const0_rtx, const1_rtx);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	  emit_insn (prefetch);
 	}
@@ -5872,7 +5872,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
 	  /* Issue a write prefetch.  */
 	  rtx distance = GEN_INT (TARGET_SETMEM_PREFETCH_DISTANCE);
 	  rtx prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, distance),
-				       const1_rtx, const0_rtx);
+				       const1_rtx, const0_rtx, const1_rtx);
 	  emit_insn (prefetch);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	}
@@ -5999,13 +5999,13 @@ s390_expand_cmpmem (rtx target, rtx op0, rtx op1, rtx len)
 
 	  /* Issue a read prefetch for the +2 cache line of operand 1.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr0, GEN_INT (512)),
-				   const0_rtx, const0_rtx);
+				   const0_rtx, const0_rtx, const1_rtx);
 	  emit_insn (prefetch);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 
 	  /* Issue a read prefetch for the +2 cache line of operand 2.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr1, GEN_INT (512)),
-				   const0_rtx, const0_rtx);
+				   const0_rtx, const0_rtx, const1_rtx);
 	  emit_insn (prefetch);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	}
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 962927c3112..4b094aa2bcf 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -11601,10 +11601,25 @@
 ; Data prefetch patterns
 ;
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand 0    "address_operand")
+	     (match_operand:SI 1 "const_int_operand")
+	     (match_operand:SI 2 "const_int_operand")
+             (match_operand:SI 3 "const_int_operand"))]
+  "TARGET_Z10"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand 0    "address_operand"   "ZT,X")
 	     (match_operand:SI 1 "const_int_operand" " n,n")
-	     (match_operand:SI 2 "const_int_operand" " n,n"))]
+	     (match_operand:SI 2 "const_int_operand" " n,n")
+             (const_int 1))]
   "TARGET_Z10"
 {
   switch (which_alternative)
diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
index 59a7b216433..54a8270e80e 100644
--- a/gcc/config/sh/sh.md
+++ b/gcc/config/sh/sh.md
@@ -10928,13 +10928,22 @@
 (define_expand "prefetch"
   [(prefetch (match_operand 0 "address_operand" "")
 	     (match_operand:SI 1 "const_int_operand" "")
-	     (match_operand:SI 2 "const_int_operand" ""))]
-  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP")
+	     (match_operand:SI 2 "const_int_operand" "")
+	     (match_operand:SI 3 "const_int_operand" ""))]
+  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
 
 (define_insn "*prefetch"
   [(prefetch (match_operand:SI 0 "register_operand" "r")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "(TARGET_SH2A || TARGET_SH3) && ! TARGET_VXWORKS_RTP"
   "pref	@%0"
   [(set_attr "type" "other")])
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 691e707863a..04cb6935b1b 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -7816,9 +7816,16 @@ visl")
 (define_expand "prefetch"
   [(match_operand 0 "address_operand" "")
    (match_operand 1 "const_int_operand" "")
-   (match_operand 2 "const_int_operand" "")]
+   (match_operand 2 "const_int_operand" "")
+   (match_operand 3 "const_int_operand" "")]
   "TARGET_V9"
 {
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+
   if (TARGET_ARCH64)
     emit_insn (gen_prefetch_64 (operands[0], operands[1], operands[2]));
   else
@@ -7829,7 +7836,8 @@ visl")
 (define_insn "prefetch_64"
   [(prefetch (match_operand:DI 0 "address_operand" "p")
 	     (match_operand:DI 1 "const_int_operand" "n")
-	     (match_operand:DI 2 "const_int_operand" "n"))]
+	     (match_operand:DI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
   static const char * const prefetch_instr[2][2] = {
@@ -7855,7 +7863,8 @@ visl")
 (define_insn "prefetch_32"
   [(prefetch (match_operand:SI 0 "address_operand" "p")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
   static const char * const prefetch_instr[2][2] = {
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 43c9ee8bffe..592f4b0e4dd 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -3454,7 +3454,7 @@ position of @var{base}, @var{min} and @var{max} to the containing insn
 and of @var{min} and @var{max} to @var{base}.  See rtl.def for details.
 
 @findex prefetch
-@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality})
+@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality} @var{cache})
 Represents prefetch of memory at address @var{addr}.
 Operand @var{rw} is 1 if the prefetch is for data to be written, 0 otherwise;
 targets that do not support write prefetches should treat this as a normal
@@ -3462,6 +3462,10 @@ prefetch.
 Operand @var{locality} specifies the amount of temporal locality; 0 if there
 is none or 1, 2, or 3 for increasing levels of temporal locality;
 targets that do not support locality hints should ignore this.
+Operand @var{cache} is 1 if the prefetch is prefetching data, 0 for prefetching
+instruction;
+targets that do not support instruction prefetch should treat all as data
+prefetch.
 
 This insn is used to minimize cache-miss latency by moving data into a
 cache before it is accessed.  It should use only non-faulting data prefetch
diff --git a/gcc/rtl.def b/gcc/rtl.def
index 08e31fa3544..f2e37d55023 100644
--- a/gcc/rtl.def
+++ b/gcc/rtl.def
@@ -277,10 +277,11 @@ DEF_RTL_EXPR(ADDR_DIFF_VEC, "addr_diff_vec", "eEee0", RTX_EXTRA)
    Operand 3 is the level of temporal locality; 0 means there is no
    temporal locality and 1, 2, and 3 are for increasing levels of temporal
    locality.
+   Operand 4 is 1 for prefetch data, 0 for prefetch instrction.
 
-   The attributes specified by operands 2 and 3 are ignored for targets
+   The attributes specified by operands 2, 3 and 4 are ignored for targets
    whose prefetch instructions do not support them.  */
-DEF_RTL_EXPR(PREFETCH, "prefetch", "eee", RTX_EXTRA)
+DEF_RTL_EXPR(PREFETCH, "prefetch", "eeee", RTX_EXTRA)
 
 /* ----------------------------------------------------------------------
    At the top level of an instruction (perhaps under PARALLEL).
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 56da7435a28..7eeef285f1e 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -6196,7 +6196,7 @@ setup_reg_subrtx_bounds (unsigned int code)
   while (format[i] == 'e');
   rtx_all_subrtx_bounds[code].count = i - rtx_all_subrtx_bounds[code].start;
   /* rtl-iter.h relies on this.  */
-  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 3);
+  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 4);
 
   for (; format[i]; ++i)
     if (format[i] == 'E' || format[i] == 'V' || format[i] == 'e')
diff --git a/gcc/target-insns.def b/gcc/target-insns.def
index de8c0092f98..ca13d1c4393 100644
--- a/gcc/target-insns.def
+++ b/gcc/target-insns.def
@@ -76,7 +76,7 @@ DEF_TARGET_INSN (omp_simt_ordered, (rtx x0, rtx x1))
 DEF_TARGET_INSN (omp_simt_vote_any, (rtx x0, rtx x1))
 DEF_TARGET_INSN (omp_simt_xchg_bfly, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (omp_simt_xchg_idx, (rtx x0, rtx x1, rtx x2))
-DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2))
+DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2, rtx x3))
 DEF_TARGET_INSN (probe_stack, (rtx x0))
 DEF_TARGET_INSN (probe_stack_address, (rtx x0))
 DEF_TARGET_INSN (prologue, (void))
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
index 4ee05a94d9f..ccc5fab15e5 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
@@ -1,57 +1,62 @@
 /* Test that __builtin_prefetch does no harm.
 
-   Prefetch using all valid combinations of rw and locality values.
+   Prefetch using all valid combinations of cache, rw and locality values.
    These must be compile-time constants.  */
 
 #define NO_TEMPORAL_LOCALITY 0
 #define LOW_TEMPORAL_LOCALITY 1
-#define MODERATE_TEMPORAL_LOCALITY 1
+#define MODERATE_TEMPORAL_LOCALITY 2
 #define HIGH_TEMPORAL_LOCALITY 3
 
 #define WRITE_ACCESS 1
 #define READ_ACCESS 0
 
+#define DATA_PRFCH 1
+#define INST_PRFCH 0
+
 enum locality { none, low, moderate, high };
 enum rw { read, write };
+enum cache { inst, data };
 
 int arr[10];
 
 void
 good_const (const int *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, READ_ACCESS, 3);
-  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY);
-  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY);
-  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY);
-  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, READ_ACCESS, 3, 1);
+  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY, 1);
+  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY, 1);
+  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY, 1);
+  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY, DATA_PRFCH);
 }
 
 void
 good_enum (const int *p)
 {
-    __builtin_prefetch (p, read, none);
-    __builtin_prefetch (p, read, low);
-    __builtin_prefetch (p, read, moderate);
-    __builtin_prefetch (p, read, high);
-    __builtin_prefetch (p, write, none);
-    __builtin_prefetch (p, write, low);
-    __builtin_prefetch (p, write, moderate);
-    __builtin_prefetch (p, write, high);
+    __builtin_prefetch (p, read, none, data);
+    __builtin_prefetch (p, read, low, data);
+    __builtin_prefetch (p, read, moderate, data);
+    __builtin_prefetch (p, read, high, data);
+    __builtin_prefetch (p, write, none, data);
+    __builtin_prefetch (p, write, low, data);
+    __builtin_prefetch (p, write, moderate, data);
+    __builtin_prefetch (p, write, high, data);
 }
 
 void
 good_expr (const int *p)
 {
-  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3));
-  __builtin_prefetch (p, 1 + 0, 1 + 2);
+  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3), 1 + 0);
+  __builtin_prefetch (p, 1 + 0, 1 + 2, 0 + 1);
 }
 
 void
 good_vararg (const int *p)
 {
+  __builtin_prefetch (p, 0, 3, 1);
   __builtin_prefetch (p, 0, 3);
   __builtin_prefetch (p, 0);
   __builtin_prefetch (p, 1);
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
index 530a1b0ef0d..6aff1f281e0 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
@@ -26,9 +26,9 @@ struct S *ptr_str = &str;
 void
 simple_global ()
 {
-  __builtin_prefetch (glob_int_arr, 0, 0);
-  __builtin_prefetch (glob_ptr_int, 0, 0);
-  __builtin_prefetch (&glob_int, 0, 0);
+  __builtin_prefetch (glob_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
+  __builtin_prefetch (&glob_int, 0, 0, 1);
 }
 
 /* Prefetch file-level static variables using the address of the variable.  */
@@ -36,9 +36,9 @@ simple_global ()
 void
 simple_file ()
 {
-  __builtin_prefetch (stat_int_arr, 0, 0);
-  __builtin_prefetch (stat_ptr_int, 0, 0);
-  __builtin_prefetch (&stat_int, 0, 0);
+  __builtin_prefetch (stat_int_arr, 0, 0, 1);
+  __builtin_prefetch (stat_ptr_int, 0, 0, 1);
+  __builtin_prefetch (&stat_int, 0, 0, 1);
 }
 
 /* Prefetch local static variables using the address of the variable.  */
@@ -49,9 +49,9 @@ simple_static_local ()
   static int gx[100];
   static int *hx = gx;
   static int ix;
-  __builtin_prefetch (gx, 0, 0);
-  __builtin_prefetch (hx, 0, 0);
-  __builtin_prefetch (&ix, 0, 0);
+  __builtin_prefetch (gx, 0, 0, 1);
+  __builtin_prefetch (hx, 0, 0, 1);
+  __builtin_prefetch (&ix, 0, 0, 1);
 }
 
 /* Prefetch local stack variables using the address of the variable.  */
@@ -62,9 +62,9 @@ simple_local ()
   int gx[100];
   int *hx = gx;
   int ix;
-  __builtin_prefetch (gx, 0, 0);
-  __builtin_prefetch (hx, 0, 0);
-  __builtin_prefetch (&ix, 0, 0);
+  __builtin_prefetch (gx, 0, 0, 1);
+  __builtin_prefetch (hx, 0, 0, 1);
+  __builtin_prefetch (&ix, 0, 0, 1);
 }
 
 /* Prefetch arguments using the address of the variable.  */
@@ -72,9 +72,9 @@ simple_local ()
 void
 simple_arg (int g[100], int *h, int i)
 {
-  __builtin_prefetch (g, 0, 0);
-  __builtin_prefetch (h, 0, 0);
-  __builtin_prefetch (&i, 0, 0);
+  __builtin_prefetch (g, 0, 0, 1);
+  __builtin_prefetch (h, 0, 0, 1);
+  __builtin_prefetch (&i, 0, 0, 1);
 }
 
 /* Prefetch using address expressions involving global variables.  */
@@ -82,25 +82,25 @@ simple_arg (int g[100], int *h, int i)
 void
 expr_global (void)
 {
-  __builtin_prefetch (&str, 0, 0);
-  __builtin_prefetch (ptr_str, 0, 0);
-  __builtin_prefetch (&str.b, 0, 0);
-  __builtin_prefetch (&ptr_str->b, 0, 0);
-  __builtin_prefetch (&str.d, 0, 0);
-  __builtin_prefetch (&ptr_str->d, 0, 0);
-  __builtin_prefetch (str.next, 0, 0);
-  __builtin_prefetch (ptr_str->next, 0, 0);
-  __builtin_prefetch (str.next->d, 0, 0);
-  __builtin_prefetch (ptr_str->next->d, 0, 0);
-
-  __builtin_prefetch (&glob_int_arr, 0, 0);
-  __builtin_prefetch (glob_ptr_int, 0, 0);
-  __builtin_prefetch (&glob_int_arr[2], 0, 0);
-  __builtin_prefetch (&glob_ptr_int[3], 0, 0);
-  __builtin_prefetch (glob_int_arr+3, 0, 0);
-  __builtin_prefetch (glob_int_arr+glob_int, 0, 0);
-  __builtin_prefetch (glob_ptr_int+5, 0, 0);
-  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0);
+  __builtin_prefetch (&str, 0, 0, 1);
+  __builtin_prefetch (ptr_str, 0, 0, 1);
+  __builtin_prefetch (&str.b, 0, 0, 1);
+  __builtin_prefetch (&ptr_str->b, 0, 0, 1);
+  __builtin_prefetch (&str.d, 0, 0, 1);
+  __builtin_prefetch (&ptr_str->d, 0, 0, 1);
+  __builtin_prefetch (str.next, 0, 0, 1);
+  __builtin_prefetch (ptr_str->next, 0, 0, 1);
+  __builtin_prefetch (str.next->d, 0, 0, 1);
+  __builtin_prefetch (ptr_str->next->d, 0, 0, 1);
+
+  __builtin_prefetch (&glob_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
+  __builtin_prefetch (&glob_int_arr[2], 0, 0, 1);
+  __builtin_prefetch (&glob_ptr_int[3], 0, 0, 1);
+  __builtin_prefetch (glob_int_arr+3, 0, 0, 1);
+  __builtin_prefetch (glob_int_arr+glob_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0, 1);
 }
 
 /* Prefetch using address expressions involving local variables.  */
@@ -114,25 +114,25 @@ expr_local (void)
   struct S *pt = &t;
   int j = 4;
 
-  __builtin_prefetch (&t, 0, 0);
-  __builtin_prefetch (pt, 0, 0);
-  __builtin_prefetch (&t.b, 0, 0);
-  __builtin_prefetch (&pt->b, 0, 0);
-  __builtin_prefetch (&t.d, 0, 0);
-  __builtin_prefetch (&pt->d, 0, 0);
-  __builtin_prefetch (t.next, 0, 0);
-  __builtin_prefetch (pt->next, 0, 0);
-  __builtin_prefetch (t.next->d, 0, 0);
-  __builtin_prefetch (pt->next->d, 0, 0);
-
-  __builtin_prefetch (&b, 0, 0);
-  __builtin_prefetch (pb, 0, 0);
-  __builtin_prefetch (&b[2], 0, 0);
-  __builtin_prefetch (&pb[3], 0, 0);
-  __builtin_prefetch (b+3, 0, 0);
-  __builtin_prefetch (b+j, 0, 0);
-  __builtin_prefetch (pb+5, 0, 0);
-  __builtin_prefetch (pb+j, 0, 0);
+  __builtin_prefetch (&t, 0, 0, 1);
+  __builtin_prefetch (pt, 0, 0, 1);
+  __builtin_prefetch (&t.b, 0, 0, 1);
+  __builtin_prefetch (&pt->b, 0, 0, 1);
+  __builtin_prefetch (&t.d, 0, 0, 1);
+  __builtin_prefetch (&pt->d, 0, 0, 1);
+  __builtin_prefetch (t.next, 0, 0, 1);
+  __builtin_prefetch (pt->next, 0, 0, 1);
+  __builtin_prefetch (t.next->d, 0, 0, 1);
+  __builtin_prefetch (pt->next->d, 0, 0, 1);
+
+  __builtin_prefetch (&b, 0, 0, 1);
+  __builtin_prefetch (pb, 0, 0, 1);
+  __builtin_prefetch (&b[2], 0, 0, 1);
+  __builtin_prefetch (&pb[3], 0, 0, 1);
+  __builtin_prefetch (b+3, 0, 0, 1);
+  __builtin_prefetch (b+j, 0, 0, 1);
+  __builtin_prefetch (pb+5, 0, 0, 1);
+  __builtin_prefetch (pb+j, 0, 0, 1);
 }
 
 int
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
index 2e2e808c172..38ce410384a 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
@@ -36,11 +36,11 @@ volatile struct S * volatile vol_ptr_vol_str = &vol_str;
 void
 simple_vol_global ()
 {
-  __builtin_prefetch (glob_vol_int_arr, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
-  __builtin_prefetch (&glob_vol_int, 0, 0);
+  __builtin_prefetch (glob_vol_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (&glob_vol_int, 0, 0, 1);
 }
 
 /* Prefetch volatile static variables using the address of the variable.  */
@@ -48,11 +48,11 @@ simple_vol_global ()
 void
 simple_vol_file ()
 {
-  __builtin_prefetch (stat_vol_int_arr, 0, 0);
-  __builtin_prefetch (stat_vol_ptr_int, 0, 0);
-  __builtin_prefetch (stat_ptr_vol_int, 0, 0);
-  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0);
-  __builtin_prefetch (&stat_vol_int, 0, 0);
+  __builtin_prefetch (stat_vol_int_arr, 0, 0, 1);
+  __builtin_prefetch (stat_vol_ptr_int, 0, 0, 1);
+  __builtin_prefetch (stat_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (&stat_vol_int, 0, 0, 1);
 }
 
 /* Prefetch using address expressions involving volatile global variables.  */
@@ -60,43 +60,43 @@ simple_vol_file ()
 void
 expr_vol_global (void)
 {
-  __builtin_prefetch (&vol_str, 0, 0);
-  __builtin_prefetch (ptr_vol_str, 0, 0);
-  __builtin_prefetch (vol_ptr_str, 0, 0);
-  __builtin_prefetch (vol_ptr_vol_str, 0, 0);
-  __builtin_prefetch (&vol_str.b, 0, 0);
-  __builtin_prefetch (&ptr_vol_str->b, 0, 0);
-  __builtin_prefetch (&vol_ptr_str->b, 0, 0);
-  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0);
-  __builtin_prefetch (&vol_str.d, 0, 0);
-  __builtin_prefetch (&vol_ptr_str->d, 0, 0);
-  __builtin_prefetch (&ptr_vol_str->d, 0, 0);
-  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0);
-  __builtin_prefetch (vol_str.next, 0, 0);
-  __builtin_prefetch (vol_ptr_str->next, 0, 0);
-  __builtin_prefetch (ptr_vol_str->next, 0, 0);
-  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0);
-  __builtin_prefetch (vol_str.next->d, 0, 0);
-  __builtin_prefetch (vol_ptr_str->next->d, 0, 0);
-  __builtin_prefetch (ptr_vol_str->next->d, 0, 0);
-  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0);
+  __builtin_prefetch (&vol_str, 0, 0, 1);
+  __builtin_prefetch (ptr_vol_str, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_str, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_vol_str, 0, 0, 1);
+  __builtin_prefetch (&vol_str.b, 0, 0, 1);
+  __builtin_prefetch (&ptr_vol_str->b, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_str->b, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0, 1);
+  __builtin_prefetch (&vol_str.d, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_str->d, 0, 0, 1);
+  __builtin_prefetch (&ptr_vol_str->d, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0, 1);
+  __builtin_prefetch (vol_str.next, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_str->next, 0, 0, 1);
+  __builtin_prefetch (ptr_vol_str->next, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0, 1);
+  __builtin_prefetch (vol_str.next->d, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_str->next->d, 0, 0, 1);
+  __builtin_prefetch (ptr_vol_str->next->d, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0, 1);
 
-  __builtin_prefetch (&glob_vol_int_arr, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
-  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0);
-  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0);
-  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0);
-  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0);
-  __builtin_prefetch (glob_vol_int_arr+3, 0, 0);
-  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0);
+  __builtin_prefetch (&glob_vol_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0, 1);
+  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0, 1);
+  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0, 1);
+  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0, 1);
+  __builtin_prefetch (glob_vol_int_arr+3, 0, 0, 1);
+  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0, 1);
 }
 
 int
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
index ade892b21a7..69b4cbe1854 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
@@ -17,7 +17,7 @@ int
 assign_arg_ptr (int *p)
 {
   int *q;
-  __builtin_prefetch ((q = p), 0, 0);
+  __builtin_prefetch ((q = p), 0, 0, 1);
   return q == p;
 }
 
@@ -25,7 +25,7 @@ int
 assign_glob_ptr (void)
 {
   int *q;
-  __builtin_prefetch ((q = ptr), 0, 0);
+  __builtin_prefetch ((q = ptr), 0, 0, 1);
   return q == ptr;
 }
 
@@ -33,7 +33,7 @@ int
 assign_arg_idx (int *p, int i)
 {
   int j;
-  __builtin_prefetch (&p[j = i], 0, 0);
+  __builtin_prefetch (&p[j = i], 0, 0, 1);
   return j == i;
 }
 
@@ -41,7 +41,7 @@ int
 assign_glob_idx (void)
 {
   int j;
-  __builtin_prefetch (&ptr[j = arrindex], 0, 0);
+  __builtin_prefetch (&ptr[j = arrindex], 0, 0, 1);
   return j == arrindex;
 }
 
@@ -53,7 +53,7 @@ preinc_arg_ptr (int *p)
 {
   int *q;
   q = p + 1;
-  __builtin_prefetch (++p, 0, 0);
+  __builtin_prefetch (++p, 0, 0, 1);
   return p == q;
 }
 
@@ -62,7 +62,7 @@ preinc_glob_ptr (void)
 {
   int *q;
   q = ptr + 1;
-  __builtin_prefetch (++ptr, 0, 0);
+  __builtin_prefetch (++ptr, 0, 0, 1);
   return ptr == q;
 }
 
@@ -71,7 +71,7 @@ postinc_arg_ptr (int *p)
 {
   int *q;
   q = p + 1;
-  __builtin_prefetch (p++, 0, 0);
+  __builtin_prefetch (p++, 0, 0, 1);
   return p == q;
 }
 
@@ -80,7 +80,7 @@ postinc_glob_ptr (void)
 {
   int *q;
   q = ptr + 1;
-  __builtin_prefetch (ptr++, 0, 0);
+  __builtin_prefetch (ptr++, 0, 0, 1);
   return ptr == q;
 }
 
@@ -89,7 +89,7 @@ predec_arg_ptr (int *p)
 {
   int *q;
   q = p - 1;
-  __builtin_prefetch (--p, 0, 0);
+  __builtin_prefetch (--p, 0, 0, 1);
   return p == q;
 }
 
@@ -98,7 +98,7 @@ predec_glob_ptr (void)
 {
   int *q;
   q = ptr - 1;
-  __builtin_prefetch (--ptr, 0, 0);
+  __builtin_prefetch (--ptr, 0, 0, 1);
   return ptr == q;
 }
 
@@ -107,7 +107,7 @@ postdec_arg_ptr (int *p)
 {
   int *q;
   q = p - 1;
-  __builtin_prefetch (p--, 0, 0);
+  __builtin_prefetch (p--, 0, 0, 1);
   return p == q;
 }
 
@@ -116,7 +116,7 @@ postdec_glob_ptr (void)
 {
   int *q;
   q = ptr - 1;
-  __builtin_prefetch (ptr--, 0, 0);
+  __builtin_prefetch (ptr--, 0, 0, 1);
   return ptr == q;
 }
 
@@ -124,7 +124,7 @@ int
 preinc_arg_idx (int *p, int i)
 {
   int j = i + 1;
-  __builtin_prefetch (&p[++i], 0, 0);
+  __builtin_prefetch (&p[++i], 0, 0, 1);
   return i == j;
 }
 
@@ -133,7 +133,7 @@ int
 preinc_glob_idx (void)
 {
   int j = arrindex + 1;
-  __builtin_prefetch (&ptr[++arrindex], 0, 0);
+  __builtin_prefetch (&ptr[++arrindex], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -141,7 +141,7 @@ int
 postinc_arg_idx (int *p, int i)
 {
   int j = i + 1;
-  __builtin_prefetch (&p[i++], 0, 0);
+  __builtin_prefetch (&p[i++], 0, 0, 1);
   return i == j;
 }
 
@@ -149,7 +149,7 @@ int
 postinc_glob_idx (void)
 {
   int j = arrindex + 1;
-  __builtin_prefetch (&ptr[arrindex++], 0, 0);
+  __builtin_prefetch (&ptr[arrindex++], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -157,7 +157,7 @@ int
 predec_arg_idx (int *p, int i)
 {
   int j = i - 1;
-  __builtin_prefetch (&p[--i], 0, 0);
+  __builtin_prefetch (&p[--i], 0, 0, 1);
   return i == j;
 }
 
@@ -165,7 +165,7 @@ int
 predec_glob_idx (void)
 {
   int j = arrindex - 1;
-  __builtin_prefetch (&ptr[--arrindex], 0, 0);
+  __builtin_prefetch (&ptr[--arrindex], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -173,7 +173,7 @@ int
 postdec_arg_idx (int *p, int i)
 {
   int j = i - 1;
-  __builtin_prefetch (&p[i--], 0, 0);
+  __builtin_prefetch (&p[i--], 0, 0, 1);
   return i == j;
 }
 
@@ -181,7 +181,7 @@ int
 postdec_glob_idx (void)
 {
   int j = arrindex - 1;
-  __builtin_prefetch (&ptr[arrindex--], 0, 0);
+  __builtin_prefetch (&ptr[arrindex--], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -200,7 +200,7 @@ getptr (int *p)
 int
 funccall_arg_ptr (int *p)
 {
-  __builtin_prefetch (getptr (p), 0, 0);
+  __builtin_prefetch (getptr (p), 0, 0, 1);
   return getptrcnt == 1;
 }
 
@@ -216,7 +216,7 @@ getint (int i)
 int
 funccall_arg_idx (int *p, int i)
 {
-  __builtin_prefetch (&p[getint (i)], 0, 0);
+  __builtin_prefetch (&p[getint (i)], 0, 0, 1);
   return getintcnt == 1;
 }
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
index f42a2c0ca87..a6fa1741888 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
@@ -18,32 +18,32 @@ int idx = 3;
 void
 arg_ptr (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
+  __builtin_prefetch (p, 0, 0, 1);
 }
 
 void
 arg_idx (char *p, int i)
 {
-  __builtin_prefetch (&p[i], 0, 0);
+  __builtin_prefetch (&p[i], 0, 0, 1);
 }
 
 void
 glob_ptr (void)
 {
-  __builtin_prefetch (ptr, 0, 0);
+  __builtin_prefetch (ptr, 0, 0, 1);
 }
 
 void
 glob_idx (void)
 {
-  __builtin_prefetch (&ptr[idx], 0, 0);
+  __builtin_prefetch (&ptr[idx], 0, 0, 1);
 }
 
 int
 main ()
 {
-  __builtin_prefetch (&s.b, 0, 0);
-  __builtin_prefetch (&s.c[1], 0, 0);
+  __builtin_prefetch (&s.b, 0, 0, 1);
+  __builtin_prefetch (&s.c[1], 0, 0, 1);
 
   arg_ptr (&s.c[1]);
   arg_ptr (ptr+3);
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
index f643c5c7286..fabecaf56dc 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
@@ -25,7 +25,7 @@ prefetch_for_read (void)
 {
   int i;
   for (i = 0; i < ARRSIZE; i++)
-    __builtin_prefetch (bad_addr[i], 0, 0);
+    __builtin_prefetch (bad_addr[i], 0, 0, 1);
 }
 
 void
@@ -33,7 +33,7 @@ prefetch_for_write (void)
 {
   int i;
   for (i = 0; i < ARRSIZE; i++)
-    __builtin_prefetch (bad_addr[i], 1, 0);
+    __builtin_prefetch (bad_addr[i], 1, 0, 1);
 }
 
 int
diff --git a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
index 11beb4e1bbe..84d564dc72c 100644
--- a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
+++ b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
@@ -1,6 +1,6 @@
 /* Test that __builtin_prefetch does no harm.
 
-   Prefetch using some invalid rw and locality values.  These must be
+   Prefetch using some invalid cache, rw and locality values.  These must be
    compile-time constants.  */
 
 /* { dg-do run } */
@@ -9,6 +9,7 @@ extern void exit (int);
 
 enum locality { none, low, moderate, high, bogus };
 enum rw { read, write };
+enum cache { inst, data };
 
 int arr[10];
 
@@ -34,6 +35,8 @@ bad (int *p)
   __builtin_prefetch (p, 0, -1);  /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
   __builtin_prefetch (p, 0, 4);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
   __builtin_prefetch (p, 0, bogus);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
+  __builtin_prefetch (p, 0, 3, -1);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
+  __builtin_prefetch (p, 0, 3, bogus);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
 }
 
 int
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
index 638749a5a68..eb9197b357c 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
@@ -9,14 +9,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
index d793437f175..b5081815f7a 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
@@ -10,14 +10,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
index 04e814d5a9c..2317f665107 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
@@ -9,14 +9,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
index 3707c7074be..936ad9e79ad 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
@@ -9,14 +9,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
new file mode 100644
index 00000000000..f082396ac2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/alpha/prefetchi-1.c b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
new file mode 100644
index 00000000000..5d9c387e260
--- /dev/null
+++ b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=ev6" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/arc/prefetchi-1.c b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
new file mode 100644
index 00000000000..7e023ab6498
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=archs" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/arm/prefetchi-1.c b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
new file mode 100644
index 00000000000..0fbcb7019bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -march=armv5te" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/hppa/prefetchi-1.c b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
new file mode 100644
index 00000000000..26854a6828d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mpa-risc-2-0" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index 051a1b59b5b..ea0b9f6bcef 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -153,7 +153,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
+#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-1.c b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
new file mode 100644
index 00000000000..b32d59f2e5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad(const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index ca662f7bd47..6c9742cf494 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -125,7 +125,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
+#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index ba1310f9f89..344913e9a90 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -94,7 +94,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
+#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/ia64/prefetchi-1.c b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
new file mode 100644
index 00000000000..f082396ac2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/mips/prefetchi-1.c b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
new file mode 100644
index 00000000000..23e78a0c7ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mips4 -mexplicit-relocs" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
new file mode 100644
index 00000000000..f082396ac2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/s390/prefetchi-1.c b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
new file mode 100644
index 00000000000..5ef557f1d8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mzarch -march=z10" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/sh/prefetchi-1.c b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
new file mode 100644
index 00000000000..347bdea8df8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { has_pref } } } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/sparc/prefetchi-1.c b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
new file mode 100644
index 00000000000..1bd7ad495e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=v9" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
-- 
2.18.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 2/2] Support Intel prefetchit0/t1
  2022-10-14  8:34 [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI Haochen Jiang
  2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
@ 2022-10-14  8:34 ` Haochen Jiang
  2022-10-20  3:45   ` H.J. Lu
  2022-10-19 20:49 ` [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI Segher Boessenkool
  2 siblings, 1 reply; 26+ messages in thread
From: Haochen Jiang @ 2022-10-14  8:34 UTC (permalink / raw)
  To: gcc-patches
  Cc: rguenther, hongtao.liu, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, segher, linkw, uweigand, krebbel, olegendo, davem,
	ebotcazou, jeffreyalaw, dave.anglin

gcc/ChangeLog:

	* common/config/i386/cpuinfo.h (get_available_features):
	Detect PREFETCHI.
	* common/config/i386/i386-common.cc
	(OPTION_MASK_ISA2_PREFETCHI_SET,
	OPTION_MASK_ISA2_PREFETCHI_UNSET): New.
	(ix86_handle_option): Handle -mprefetchi.
	* common/config/i386/i386-cpuinfo.h (enum processor_features):
	Add FEATURE_PREFETCHI.
	* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
	prefetchi.
	* config.gcc: Add prfchiintrin.h.
	* config/i386/cpuid.h (bit_PREFETCHI): New.
	* config/i386/i386-c.cc (ix86_target_macros_internal): Define
	__PREFETCHI__.
	* config/i386/i386-isa.def (PREFETCHI):	Add DEF_PTA(PREFETCHI).
	* config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
	Handle prefetchi.
	* config/i386/i386.md (prefetch): Add handler for prefetchi
	(*prefetch_i): New define_insn.
	* config/i386/i386.opt: Add option -mprefetchi.
	* config/i386/immintrin.h: Include prfchiintrin.h.
	* config/i386/predicates.md (local_func_symbolic_operand):
	New predicates.
	* config/i386/xmmintrin.h (enum _mm_hint): New enum for prefetchi.
	(_mm_prefetch): Handle the highest bit of enum.
	* doc/extend.texi: Document prefetchi.
	* doc/invoke.texi: Document -mprefetchi.
	* doc/sourcebuild.texi: Document target prefetchi.
	* config/i386/prfchiintrin.h: New file.

gcc/testsuite/ChangeLog:

	* g++.dg/other/i386-2.C: Add -mprefetchi.
	* g++.dg/other/i386-3.C: Ditto.
	* gcc.misc-tests/i386-pf-3dnow-1.c: Add scan-assembler-not for
	prefetchit0/t1.
	* gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
	* gcc.misc-tests/i386-pf-sse-1.c: Ditto.
	* gcc.target/i386/avx-1.c: Add -mprefetchi.
	* gcc.target/i386/avx-2.c: Ditto.
	* gcc.target/i386/funcspec-56.inc: Add new target attribute.
	* gcc.target/i386/prefetchi-1.c: Rewrite testcase.
	* gcc.target/i386/prefetchi-2.c: New test.
	* gcc.target/i386/prefetchi-3.c: Ditto.
	* gcc.target/i386/sse-12.c: Add -mprefetchi.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-14.c: Ditto.
	* gcc.target/i386/sse-22.c: Add prefetchi.
	* gcc.target/i386/sse-23.c: Ditto.

Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
---
 gcc/common/config/i386/cpuinfo.h              |  2 +
 gcc/common/config/i386/i386-common.cc         | 15 ++++
 gcc/common/config/i386/i386-cpuinfo.h         |  1 +
 gcc/common/config/i386/i386-isas.h            |  1 +
 gcc/config.gcc                                |  2 +-
 gcc/config/i386/cpuid.h                       |  1 +
 gcc/config/i386/i386-c.cc                     |  2 +
 gcc/config/i386/i386-isa.def                  |  1 +
 gcc/config/i386/i386-options.cc               |  4 +-
 gcc/config/i386/i386.md                       | 90 +++++++++++++------
 gcc/config/i386/i386.opt                      |  4 +
 gcc/config/i386/immintrin.h                   |  2 +
 gcc/config/i386/predicates.md                 | 15 ++++
 gcc/config/i386/prfchiintrin.h                | 39 ++++++++
 gcc/config/i386/xmmintrin.h                   |  6 +-
 gcc/doc/extend.texi                           |  5 ++
 gcc/doc/invoke.texi                           | 10 ++-
 gcc/doc/sourcebuild.texi                      |  3 +
 gcc/testsuite/g++.dg/other/i386-2.C           |  2 +-
 gcc/testsuite/g++.dg/other/i386-3.C           |  2 +-
 .../gcc.misc-tests/i386-pf-3dnow-1.c          |  2 +
 .../gcc.misc-tests/i386-pf-athlon-1.c         |  2 +
 gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  2 +
 gcc/testsuite/gcc.target/i386/avx-1.c         |  2 +-
 gcc/testsuite/gcc.target/i386/avx-2.c         |  2 +-
 gcc/testsuite/gcc.target/i386/funcspec-56.inc |  2 +
 gcc/testsuite/gcc.target/i386/prefetchi-1.c   | 36 ++++++--
 gcc/testsuite/gcc.target/i386/prefetchi-2.c   | 26 ++++++
 gcc/testsuite/gcc.target/i386/prefetchi-3.c   | 15 ++++
 gcc/testsuite/gcc.target/i386/sse-12.c        |  2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c        |  2 +-
 gcc/testsuite/gcc.target/i386/sse-14.c        |  2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c        |  4 +-
 gcc/testsuite/gcc.target/i386/sse-23.c        |  2 +-
 34 files changed, 259 insertions(+), 49 deletions(-)
 create mode 100644 gcc/config/i386/prfchiintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-3.c

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 118f3a42abd..551e0483330 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -797,6 +797,8 @@ get_available_features (struct __processor_model *cpu_model,
 	set_feature (FEATURE_HRESET);
       if (eax & bit_CMPCCXADD)
 	set_feature(FEATURE_CMPCCXADD);
+      if (edx & bit_PREFETCHI)
+	set_feature (FEATURE_PREFETCHI);
       if (avx_usable)
 	{
 	  if (eax & bit_AVXVNNI)
diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
index f3d00ce4bc9..77ff07a3797 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -112,6 +112,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA2_AVXNECONVERT_SET OPTION_MASK_ISA2_AVXNECONVERT
 #define OPTION_MASK_ISA2_CMPCCXADD_SET OPTION_MASK_ISA2_CMPCCXADD
 #define OPTION_MASK_ISA2_AMX_FP16_SET OPTION_MASK_ISA2_AMX_FP16
+#define OPTION_MASK_ISA2_PREFETCHI_SET OPTION_MASK_ISA2_PREFETCHI
 
 /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
    as -msse4.2.  */
@@ -287,6 +288,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA2_AVXNECONVERT_UNSET OPTION_MASK_ISA2_AVXNECONVERT
 #define OPTION_MASK_ISA2_CMPCCXADD_UNSET OPTION_MASK_ISA2_CMPCCXADD
 #define OPTION_MASK_ISA2_AMX_FP16_UNSET OPTION_MASK_ISA2_AMX_FP16
+#define OPTION_MASK_ISA2_PREFETCHI_UNSET OPTION_MASK_ISA2_PREFETCHI
 
 /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
    as -mno-sse4.1. */
@@ -1211,6 +1213,19 @@ ix86_handle_option (struct gcc_options *opts,
 	}
       return true;
 
+    case OPT_mprefetchi:
+      if (value)
+	{
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_PREFETCHI_SET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_SET;
+	}
+      else
+	{
+	  opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_PREFETCHI_UNSET;
+	  opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_UNSET;
+	}
+      return true;
+
     case OPT_mfma:
       if (value)
 	{
diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h
index f9d5b7238ea..3fe69178841 100644
--- a/gcc/common/config/i386/i386-cpuinfo.h
+++ b/gcc/common/config/i386/i386-cpuinfo.h
@@ -246,6 +246,7 @@ enum processor_features
   FEATURE_AVXNECONVERT,
   FEATURE_CMPCCXADD,
   FEATURE_AMX_FP16,
+  FEATURE_PREFETCHI,
   CPU_FEATURE_MAX
 };
 
diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h
index 7c4a71413b5..8648ea6903c 100644
--- a/gcc/common/config/i386/i386-isas.h
+++ b/gcc/common/config/i386/i386-isas.h
@@ -182,4 +182,5 @@ ISA_NAMES_TABLE_START
 			P_NONE, "-mavxneconvert")
   ISA_NAMES_TABLE_ENTRY("cmpccxadd", FEATURE_CMPCCXADD, P_NONE, "-mcmpccxadd")
   ISA_NAMES_TABLE_ENTRY("amx-fp16", FEATURE_AMX_FP16, P_NONE, "-mamx-fp16")
+  ISA_NAMES_TABLE_ENTRY("prefetchi", FEATURE_PREFETCHI, P_NONE, "-mprefetchi")
 ISA_NAMES_TABLE_END
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 8a8712d1466..ceea7726bfd 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -423,7 +423,7 @@ i[34567]86-*-* | x86_64-*-*)
 		       hresetintrin.h keylockerintrin.h avxvnniintrin.h
 		       mwaitintrin.h avx512fp16intrin.h avx512fp16vlintrin.h
 		       avxifmaintrin.h avxvnniint8intrin.h avxneconvertintrin.h
-		       cmpccxaddintrin.h amxfp16intrin.h"
+		       cmpccxaddintrin.h amxfp16intrin.h prfchiintrin.h"
 	;;
 ia64-*-*)
 	extra_headers=ia64intrin.h
diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
index 229c15c5950..92583261883 100644
--- a/gcc/config/i386/cpuid.h
+++ b/gcc/config/i386/cpuid.h
@@ -54,6 +54,7 @@
 #define bit_AVXVNNIINT8 (1 << 4)
 #define bit_AVXNECONVERT (1 << 5)
 #define bit_CMPXCHG8B	(1 << 8)
+#define bit_PREFETCHI	(1 << 14)
 #define bit_CMOV	(1 << 15)
 #define bit_MMX		(1 << 23)
 #define bit_FXSAVE	(1 << 24)
diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
index 3020b5f267a..74239002ed6 100644
--- a/gcc/config/i386/i386-c.cc
+++ b/gcc/config/i386/i386-c.cc
@@ -650,6 +650,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
     def_or_undef (parse_in, "__CMPCCXADD__");
   if (isa_flag2 & OPTION_MASK_ISA2_AMX_FP16)
     def_or_undef (parse_in, "__AMX_FP16__");
+  if (isa_flag2 & OPTION_MASK_ISA2_PREFETCHI)
+    def_or_undef (parse_in, "__PREFETCHI__");
   if (TARGET_IAMCU)
     {
       def_or_undef (parse_in, "__iamcu");
diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def
index 55b25763957..f234dcc37d7 100644
--- a/gcc/config/i386/i386-isa.def
+++ b/gcc/config/i386/i386-isa.def
@@ -114,3 +114,4 @@ DEF_PTA(AVXVNNIINT8)
 DEF_PTA(AVXNECONVERT)
 DEF_PTA(CMPCCXADD)
 DEF_PTA(AMX_FP16)
+DEF_PTA(PREFETCHI)
diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index bf37c77589e..3f98b09e5cf 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -232,7 +232,8 @@ static struct ix86_target_opts isa2_opts[] =
   { "-mavxvnniint8",	OPTION_MASK_ISA2_AVXVNNIINT8 },
   { "-mavxneconvert",   OPTION_MASK_ISA2_AVXNECONVERT },
   { "-mcmpccxadd",      OPTION_MASK_ISA2_CMPCCXADD },
-  { "-mamx-fp16",       OPTION_MASK_ISA2_AMX_FP16 }
+  { "-mamx-fp16",       OPTION_MASK_ISA2_AMX_FP16 },
+  { "-mprefetchi",      OPTION_MASK_ISA2_PREFETCHI }
 };
 static struct ix86_target_opts isa_opts[] =
 {
@@ -1084,6 +1085,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
     IX86_ATTR_ISA ("avxneconvert",   OPT_mavxneconvert),
     IX86_ATTR_ISA ("cmpccxadd",   OPT_mcmpccxadd),
     IX86_ATTR_ISA ("amx-fp16", OPT_mamx_fp16),
+    IX86_ATTR_ISA ("prefetchi",   OPT_mprefetchi),
 
     /* enum options */
     IX86_ATTR_ENUM ("fpmath=",	OPT_mfpmath_),
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c65cf14b9f4..fb75f57483b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -23637,47 +23637,65 @@
 	     (match_operand:SI 1 "const_int_operand")
 	     (match_operand:SI 2 "const_int_operand")
 	     (match_operand:SI 3 "const_int_operand"))]
-  "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
+  "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1
+  || TARGET_PREFETCHI"
 {
-  if (INTVAL (operands[3]) == 0)
-  {
-    warning (0, "instruction prefetch is not supported; using data prefetch");
-    operands[3] = const1_rtx;
-  }
   bool write = operands[1] != const0_rtx;
   int locality = INTVAL (operands[2]);
+  bool data = operands[3] != const0_rtx;
 
   gcc_assert (IN_RANGE (locality, 0, 3));
 
-  /* Use 3dNOW prefetch in case we are asking for write prefetch not
-     supported by SSE counterpart (non-SSE2 athlon machines) or the
-     SSE prefetch is not available (K6 machines).  Otherwise use SSE
-     prefetch as it allows specifying of locality.  */
-
-  if (write)
+  if (data)
     {
-      if (TARGET_PREFETCHWT1)
-	operands[2] = GEN_INT (MAX (locality, 2)); 
-      else if (TARGET_PRFCHW)
-	operands[2] = GEN_INT (3);
-      else if (TARGET_3DNOW && !TARGET_SSE2)
-	operands[2] = GEN_INT (3);
-      else if (TARGET_PREFETCH_SSE)
-	operands[1] = const0_rtx;
+      /* Use 3dNOW prefetch in case we are asking for write prefetch not
+	 supported by SSE counterpart (non-SSE2 athlon machines) or the
+	 SSE prefetch is not available (K6 machines).  Otherwise use SSE
+	 prefetch as it allows specifying of locality.  */
+
+      if (write)
+	{
+	  if (TARGET_PREFETCHWT1)
+	    operands[2] = GEN_INT (MAX (locality, 2));
+	  else if (TARGET_PRFCHW)
+	    operands[2] = GEN_INT (3);
+	  else if (TARGET_3DNOW && !TARGET_SSE2)
+	    operands[2] = GEN_INT (3);
+	  else if (TARGET_PREFETCH_SSE)
+	    operands[1] = const0_rtx;
+	  else
+	    {
+	      gcc_assert (TARGET_3DNOW);
+	      operands[2] = GEN_INT (3);
+	    }
+	}
       else
 	{
-	  gcc_assert (TARGET_3DNOW);
-	  operands[2] = GEN_INT (3);
+	  if (TARGET_PREFETCH_SSE)
+	    ;
+	  else
+	    {
+	      gcc_assert (TARGET_3DNOW);
+	      operands[2] = GEN_INT (3);
+	    }
 	}
     }
   else
     {
-      if (TARGET_PREFETCH_SSE)
+      /* GOT/PLT_PIC should not be available for instruction prefetch.
+	 It must be real instruction address.  */
+      if (TARGET_PREFETCHI && TARGET_64BIT
+	 && local_func_symbolic_operand (operands[0], GET_MODE (operands[0])))
 	;
       else
 	{
-	  gcc_assert (TARGET_3DNOW);
-	  operands[2] = GEN_INT (3);
+	  /* Ignore the hint.  */
+	  warning (0, "instruction prefetch applies when in 64-bit mode"
+		      " with RIP-relative addressing and"
+		      " option %<-mprefetchi%>;"
+		      " they stay NOPs otherwise");
+	  emit_insn (gen_nop ());
+	  DONE;
 	}
     }
 })
@@ -23733,6 +23751,28 @@
 	(symbol_ref "memory_address_length (operands[0], false)"))
    (set_attr "memory" "none")])
 
+(define_insn "*prefetch_i"
+  [(prefetch (match_operand 0 "local_func_symbolic_operand" "p")
+	     (const_int 0)
+	     (match_operand:SI 1 "const_int_operand")
+	     (const_int 0))]
+  "TARGET_PREFETCHI"
+{
+  static const char * const patterns[2] = {
+    "prefetchit1\t%a0", "prefetchit0\t%a0"
+  };
+
+  int locality = INTVAL (operands[1]);
+  gcc_assert (IN_RANGE (locality, 2, 3));
+
+  return patterns[locality - 2];
+}
+  [(set_attr "type" "sse")
+   (set_attr "atom_sse_attr" "prefetch")
+   (set (attr "length_address")
+	(symbol_ref "memory_address_length (operands[0], false)"))
+   (set_attr "memory" "none")])
+
 (define_expand "stack_protect_set"
   [(match_operand 0 "memory_operand")
    (match_operand 1 "memory_operand")]
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index eaa43946341..1d91103cd54 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1238,3 +1238,7 @@ CMPCCXADD build-in functions and code generation.
 mamx-fp16
 Target Mask(ISA2_AMX_FP16) Var(ix86_isa_flags2) Save
 Support AMX-FP16 built-in functions and code generation.
+
+mprefetchi
+Target Mask(ISA2_PREFETCHI) Var(ix86_isa_flags2) Save
+Support PREFETCHI built-in functions and code generation.
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index d8415863f52..ac6402653e0 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -134,6 +134,8 @@
 
 #include <amxbf16intrin.h>
 
+#include <prfchiintrin.h>
+
 #include <prfchwintrin.h>
 
 #include <keylockerintrin.h>
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index c4141a96735..2a3f07224cc 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -610,6 +610,21 @@
   return false;
 })
 
+(define_predicate "local_func_symbolic_operand"
+  (match_operand 0 "local_symbolic_operand")
+{
+  if (GET_CODE (op) == CONST
+      && GET_CODE (XEXP (op, 0)) == PLUS
+      && CONST_INT_P (XEXP (XEXP (op, 0), 1)))
+    op = XEXP (XEXP (op, 0), 0);
+
+  if (GET_CODE (op) == SYMBOL_REF
+      && !SYMBOL_REF_FUNCTION_P (op))
+    return false;
+
+  return true;
+})
+
 ;; Test for a legitimate @GOTOFF operand.
 ;;
 ;; VxWorks does not impose a fixed gap between segments; the run-time
diff --git a/gcc/config/i386/prfchiintrin.h b/gcc/config/i386/prfchiintrin.h
new file mode 100644
index 00000000000..e0240740e0b
--- /dev/null
+++ b/gcc/config/i386/prfchiintrin.h
@@ -0,0 +1,39 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#if !defined _IMMINTRIN_H_INCLUDED
+# error "Never use <prfchiintrin.h> directly; include <immintrin.h> instead."
+#endif
+
+#ifndef _PRFCHIINTRIN_H_INCLUDED
+#define _PRFCHIINTRIN_H_INCLUDED
+
+#ifdef __x86_64__
+extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_m_prefetchi (void* __P)
+{
+  __builtin_prefetch (__P, 0, 3, 0 /* _MM_HINT_IT0 */);
+}
+#endif
+
+#endif /* _PRFCHIINTRIN_H_INCLUDED */
diff --git a/gcc/config/i386/xmmintrin.h b/gcc/config/i386/xmmintrin.h
index 62659080601..2fc644447e1 100644
--- a/gcc/config/i386/xmmintrin.h
+++ b/gcc/config/i386/xmmintrin.h
@@ -36,6 +36,8 @@
 /* Constants for use with _mm_prefetch.  */
 enum _mm_hint
 {
+  _MM_HINT_IT0 = 19,
+  _MM_HINT_IT1 = 18,
   /* _MM_HINT_ET is _MM_HINT_T with set 3rd bit.  */
   _MM_HINT_ET0 = 7,
   _MM_HINT_ET1 = 6,
@@ -51,11 +53,11 @@ enum _mm_hint
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_prefetch (const void *__P, enum _mm_hint __I)
 {
-  __builtin_prefetch (__P, (__I & 0x4) >> 2, __I & 0x3);
+  __builtin_prefetch (__P, (__I & 0x4) >> 2, __I & 0x3, ((__I & 0x10) >> 4) ^ 0x1);
 }
 #else
 #define _mm_prefetch(P, I) \
-  __builtin_prefetch ((P), ((I & 0x4) >> 2), (I & 0x3))
+  __builtin_prefetch ((P), ((I & 0x4) >> 2), (I & 0x3), (((I & 0x10) >> 4) ^ 0x1))
 #endif
 
 #ifndef __SSE__
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e51d7835e69..2e0493fe8ba 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7085,6 +7085,11 @@ Enable/disable the generation of the CMPccXADD instructions.
 @cindex @code{target("amx-fp16")} function attribute, x86
 Enable/disable the generation of the AMX-FP16 instructions.
 
+@item prefetchi
+@itemx no-prefetchi
+@cindex @code{target("prefetchi")} function attribute, x86
+Enable/disable the generation of the PREFETCHI instructions.
+
 @item cld
 @itemx no-cld
 @cindex @code{target("cld")} function attribute, x86
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1014e2ded99..07a597d1b44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1437,6 +1437,7 @@ See RS/6000 and PowerPC Options.
 -mrdseed  -msgx -mavx512vp2intersect -mserialize -mtsxldtrk@gol
 -mamx-tile  -mamx-int8  -mamx-bf16 -muintr -mhreset -mavxvnni@gol
 -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 @gol
+-mprefetchi @gol
 -mcldemote  -mms-bitfields  -mno-align-stringops  -minline-all-stringops @gol
 -minline-stringops-dynamically  -mstringop-strategy=@var{alg} @gol
 -mkl -mwidekl @gol
@@ -32916,6 +32917,9 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @need 200
 @itemx -mamx-fp16
 @opindex mamx-fp16
+@need 200
+@itemx -mprefetchi
+@opindex mprefetchi
 These switches enable the use of instructions in the MMX, SSE,
 SSE2, SSE3, SSSE3, SSE4, SSE4A, SSE4.1, SSE4.2, AVX, AVX2, AVX512F, AVX512PF,
 AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA,
@@ -32926,9 +32930,9 @@ XSAVEOPT, XSAVEC, XSAVES, RTM, HLE, TBM, MWAITX, CLZERO, PKU, AVX512VBMI2,
 GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16,
 ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE,
 UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512FP16,
-AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16 or CLDEMOTE extended
-instruction sets. Each has a corresponding @option{-mno-} option to disable
-use of these instructions.
+AVXIFMA, AVXVNNIINT8, AVXNECONVERT, CMPCCXADD, AMX-FP16, PREFETCHI or CLDEMOTE
+extended instruction sets. Each has a corresponding @option{-mno-} option to
+disable use of these instructions.
 
 These extensions are also available as built-in functions: see
 @ref{x86 Built-in Functions}, for details of the functions enabled and
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 5de5e9576d5..58adb6516ed 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2535,6 +2535,9 @@ Target does not require strict alignment.
 @item pie_copyreloc
 The x86-64 target linker supports PIE with copy reloc.
 
+@item prefetchi
+Target supports the execution of @code{prefetchi} instructions.
+
 @item rdrand
 Target supports x86 @code{rdrand} instruction.
 
diff --git a/gcc/testsuite/g++.dg/other/i386-2.C b/gcc/testsuite/g++.dg/other/i386-2.C
index 79b84af0a75..ec3b1864ec0 100644
--- a/gcc/testsuite/g++.dg/other/i386-2.C
+++ b/gcc/testsuite/g++.dg/other/i386-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16" } */
+/* { dg-options "-O -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt  -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 -mprefetchi" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/g++.dg/other/i386-3.C b/gcc/testsuite/g++.dg/other/i386-3.C
index c811a4454bf..542275ca057 100644
--- a/gcc/testsuite/g++.dg/other/i386-3.C
+++ b/gcc/testsuite/g++.dg/other/i386-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile { target i?86-*-* x86_64-*-* } } */
-/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16" } */
+/* { dg-options "-O -fkeep-inline-functions -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 -mprefetchi" } */
 
 /* Test that {,x,e,p,t,s,w,a,b,i}mmintrin.h, mm3dnow.h, fma4intrin.h,
    xopintrin.h, abmintrin.h, bmiintrin.h, tbmintrin.h, lwpintrin.h,
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
index eb9197b357c..40367947fb2 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
@@ -29,3 +29,5 @@ int main ()
 /* { dg-final { scan-assembler "prefetchw" } } */
 /* { dg-final { scan-assembler-not "prefetchnta" } } */
 /* { dg-final { scan-assembler-not "prefetcht" } } */
+/* { dg-final { scan-assembler-not "prefetchit0" } } */
+/* { dg-final { scan-assembler-not "prefetchit1" } } */
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
index b5081815f7a..0dda9f65ad5 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
@@ -29,3 +29,5 @@ int main ()
 /* { dg-final { scan-assembler "prefetchw" } } */
 /* { dg-final { scan-assembler "prefetchnta" } } */
 /* { dg-final { scan-assembler "prefetcht" } } */
+/* { dg-final { scan-assembler-not "prefetchit0" } } */
+/* { dg-final { scan-assembler-not "prefetchit1" } } */
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
index 936ad9e79ad..44d92f3a06e 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
@@ -30,3 +30,5 @@ int main ()
 /* { dg-final { scan-assembler "prefetcht1" } } */
 /* { dg-final { scan-assembler "prefetcht2" } } */
 /* { dg-final { scan-assembler-not "prefetchw" } } */
+/* { dg-final { scan-assembler-not "prefetchit0" } } */
+/* { dg-final { scan-assembler-not "prefetchit1" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index ea0b9f6bcef..e599d1aa5d3 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw -mavx512fp16 -mavx512vl" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw -mavx512fp16 -mavx512vl -mprefetchi" } */
 /* { dg-add-options bind_pic_locally } */
 
 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/avx-2.c b/gcc/testsuite/gcc.target/i386/avx-2.c
index 642ae4d7bfb..af1f796fc68 100644
--- a/gcc/testsuite/gcc.target/i386/avx-2.c
+++ b/gcc/testsuite/gcc.target/i386/avx-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -msse4a -maes -mpclmul -mavx512bw -mavx512fp16 -mavx512vl" } */
+/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -msse4a -maes -mpclmul -mavx512bw -mavx512fp16 -mavx512vl -mprefetchi" } */
 /* { dg-add-options bind_pic_locally } */
 
 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
index ef9d4c5f5a4..2028f869f07 100644
--- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
+++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
@@ -85,6 +85,7 @@ extern void test_avxvnniint8 (void)		__attribute__((__target__("avxvnniint8")));
 extern void test_avxneconvert (void)		__attribute__((__target__("avxneconvert")));
 extern void test_cmpccxadd (void)		__attribute__((__target__("cmpccxadd")));
 extern void test_amx_fp16 (void)		__attribute__((__target__("amx-fp16")));
+extern void test_prefetchi (void)               __attribute__((__target__("prefetchi")));
 
 extern void test_no_sgx (void)			__attribute__((__target__("no-sgx")));
 extern void test_no_avx5124fmaps(void)		__attribute__((__target__("no-avx5124fmaps")));
@@ -171,6 +172,7 @@ extern void test_no_avxvnniint8 (void)		__attribute__((__target__("no-avxvnniint
 extern void test_no_avxneconvert (void)		__attribute__((__target__("no-avxneconvert")));
 extern void test_no_cmpccxadd (void)            __attribute__((__target__("no-cmpccxadd")));
 extern void test_no_amx_fp16 (void)		__attribute__((__target__("no-amx-fp16")));
+extern void test_no_prefetchi (void)            __attribute__((__target__("no-prefetchi")));
 
 extern void test_arch_nocona (void)		__attribute__((__target__("arch=nocona")));
 extern void test_arch_core2 (void)		__attribute__((__target__("arch=core2")));
diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-1.c b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
index b32d59f2e5f..f6a27ce267f 100644
--- a/gcc/testsuite/gcc.target/i386/prefetchi-1.c
+++ b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
@@ -1,11 +1,33 @@
-/* { dg-do compile } */
-/* { dg-options "-O2 -msse" } */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mprefetchi -O2" } */
+/* { dg-final { scan-assembler-times "\[ \\t\]+prefetchit0\[ \\t\]+" 2 } } */
+/* { dg-final { scan-assembler "\[ \\t\]+prefetchit1\[ \\t\]+" } } */
 
-/* Remind users that instruction prefetch is not supported yet.  */
+#include <x86intrin.h>
 
-void
-bad(const int* p)
+int
+bar (int a)
 {
-  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
-  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  return a + 1;
+}
+
+int
+foo1 (int b)
+{
+  _mm_prefetch (bar, _MM_HINT_IT0);
+  return bar (b) + 1;
+}
+
+int
+foo2 (int b)
+{
+  _mm_prefetch (bar, _MM_HINT_IT1);
+  return bar (b) + 1;
+}
+
+int
+foo3 (int b)
+{
+  _m_prefetchi (bar);
+  return bar (b) + 1;
 }
diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-2.c b/gcc/testsuite/gcc.target/i386/prefetchi-2.c
new file mode 100644
index 00000000000..19a5dd18719
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/prefetchi-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-mprefetchi -fpie -O2" } */
+/* { dg-final { scan-assembler-not "\[ \\t\]+prefetchit0" } } */
+/* { dg-final { scan-assembler-not "\[ \\t\]+prefetchit1" } } */
+
+#include <x86intrin.h>
+
+int
+bar (int a)
+{
+  return a + 1;
+}
+
+int
+foo1 (int b)
+{
+  __builtin_prefetch (bar, 0, 3, 0); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
+  return bar (b) + 1;
+}
+
+int
+foo2 (int b)
+{
+  __builtin_prefetch (bar, 0, 2, 0); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
+  return bar (b) + 1;
+}
diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-3.c b/gcc/testsuite/gcc.target/i386/prefetchi-3.c
new file mode 100644
index 00000000000..cbca2ab34d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/prefetchi-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mprefetchi -O2" } */
+/* { dg-final { scan-assembler-not "prefetchit0" } } */
+/* { dg-final { scan-assembler-not "prefetchit1" } } */
+
+#include <x86intrin.h>
+
+void* p;
+
+void extern
+prefetchi_test (void)
+{
+  __builtin_prefetch (p, 0, 3, 0); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
+  __builtin_prefetch (p, 0, 2, 0); /* { dg-warning "instruction prefetch applies when in 64-bit mode with RIP-relative addressing and option '-mprefetchi'; they stay NOPs otherwise" } */
+}
diff --git a/gcc/testsuite/gcc.target/i386/sse-12.c b/gcc/testsuite/gcc.target/i386/sse-12.c
index df2684abbb6..8c556f3fcc5 100644
--- a/gcc/testsuite/gcc.target/i386/sse-12.c
+++ b/gcc/testsuite/gcc.target/i386/sse-12.c
@@ -3,7 +3,7 @@
    popcntintrin.h gfniintrin.h and mm_malloc.h are usable
    with -O -std=c89 -pedantic-errors.  */
 /* { dg-do compile } */
-/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavxifma -mavxvnniint8 -mavxneconvert -mamx-fp16" } */
+/* { dg-options "-O -std=c89 -pedantic-errors -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512bw -mavx512dq -mavx512vl -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavxifma -mavxvnniint8 -mavxneconvert -mamx-fp16 -mprefetchi" } */
 
 #include <x86intrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 6c9742cf494..ee5ba5ae4d5 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mcmpccxadd -mamx-fp16 -mprefetchi" } */
 /* { dg-add-options bind_pic_locally } */
 
 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index 4a47d4093a2..4f3bd70d03e 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mpconfig -mwbnoinvd -mavx512vl -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mamx-fp16" } */
+/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni -mpconfig -mwbnoinvd -mavx512vl -mavx512bf16 -menqcmd -mavx512vp2intersect -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl -mavxvnni -mavx512fp16 -mavxifma -mavxvnniint8 -mavxneconvert -mamx-fp16 -mprefetchi" } */
 /* { dg-add-options bind_pic_locally } */
 
 #include <mm_malloc.h>
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index 178a2fce492..8bd046b19c2 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -103,7 +103,7 @@
 
 
 #ifndef DIFFERENT_PRAGMAS
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512vbmi2,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,amx-fp16")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512vbmi2,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,amx-fp16,prefetchi")
 #endif
 
 /* Following intrinsics require immediate arguments.  They
@@ -220,7 +220,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int, 1)
 
 /* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */
 #ifdef DIFFERENT_PRAGMAS
-#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx512vbmi2,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,amx-fp16")
+#pragma GCC target ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx512vbmi2,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,amx-fp16,prefetchi")
 #endif
 #include <immintrin.h>
 test_1 (_cvtss_sh, unsigned short, float, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 344913e9a90..16ac9c9b7a4 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -847,6 +847,6 @@
 #define __builtin_ia32_cmpccxadd(A, B, C, D) __builtin_ia32_cmpccxadd(A, B, C, 1)
 #define __builtin_ia32_cmpccxadd64(A, B, C, D) __builtin_ia32_cmpccxadd64(A, B, C, 1)
 
-#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,cmpccxadd,amx-fp16")
+#pragma GCC target ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16,avxifma,avxvnniint8,avxneconvert,cmpccxadd,amx-fp16,prefetchi")
 
 #include <x86intrin.h>
-- 
2.18.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
@ 2022-10-14  8:46   ` Hongtao Liu
  2022-10-17 15:28   ` Richard Earnshaw
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Hongtao Liu @ 2022-10-14  8:46 UTC (permalink / raw)
  To: Haochen Jiang
  Cc: gcc-patches, aoliva, richard.sandiford, uweigand, linkw, gnu,
	dje.gcc, olegendo, claziss, segher, mfortune, davem, dave.anglin,
	hubicka, richard.earnshaw, rguenther, marcus.shawcroft,
	ramana.radhakrishnan, hongtao.liu

This patch tries to add a parameter to generate instruction prefetch
instead of data prefetch. Currently, __builtin_prefetch assumes data
prefetch only.

On Fri, Oct 14, 2022 at 4:39 PM Haochen Jiang via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> gcc/ChangeLog:
>
>         * builtins.cc (expand_builtin_prefetch): Handle the fourth parameter in
>         expand function.
>         * config/aarch64/aarch64-sve.md: Add default parameter value.
>         * config/aarch64/aarch64.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/alpha/alpha.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/arc/arc.md: Add default parameter value.
>         * config/arm/arm.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/frv/frv.md: Ditto.
>         * config/i386/i386.md: Ditto.
>         * config/ia64/ia64.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/mips/mips.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/pa/pa.md: Ditto.
>         * config/rs6000/rs6000.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for
>         gen_prefetch call.
>         (s390_expand_setmem): Ditto.
>         (s390_expand_cmpmem): Ditto.
>         * config/s390/s390.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/sh/sh.md: Ditto.
>         * config/sparc/sparc.md: Ditto.
>         * doc/rtl.texi: Document cache variable for prefetch.
>         * rtl.def (PREFETCH): Change prefetch DEF_RTL_EXPR to add fourth parameter.
>         * rtlanal.cc (setup_reg_subrtx_bounds): Change gcc_checking_assert for
>         fourth parameter.
>         * target-insns.def (prefetch): Add fourth rtx for prefetch.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.c-torture/execute/builtin-prefetch-1.c: Add fourth parameter for
>         testcases.
>         * gcc.c-torture/execute/builtin-prefetch-2.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-3.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-4.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-5.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-6.c: Ditto.
>         * gcc.dg/builtin-prefetch-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-3dnow-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-none-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-sse-1.c: Ditto.
>         * gcc.target/i386/avx-1.c: Change prefetch macro define to variable args.
>         * gcc.target/i386/sse-13.c: Ditto.
>         * gcc.target/i386/sse-23.c: Ditto.
>         * gcc.target/aarch64/prefetchi-1.c: New test.
>         * gcc.target/alpha/prefetchi-1.c: Ditto.
>         * gcc.target/arc/prefetchi-1.c: Ditto.
>         * gcc.target/arm/prefetchi-1.c: Ditto.
>         * gcc.target/hppa/prefetchi-1.c: Ditto.
>         * gcc.target/i386/prefetchi-1.c: Ditto.
>         * gcc.target/ia64/prefetchi-1.c: Ditto.
>         * gcc.target/mips/prefetchi-1.c: Ditto.
>         * gcc.target/powerpc/prefetchi-1.c: Ditto.
>         * gcc.target/s390/prefetchi-1.c: Ditto.
>         * gcc.target/sh/prefetchi-1.c: Ditto.
>         * gcc.target/sparc/prefetchi-1.c: Ditto.
> ---
>  gcc/builtins.cc                               |  34 ++++--
>  gcc/config/aarch64/aarch64-sve.md             |  15 ++-
>  gcc/config/aarch64/aarch64.md                 |  19 +++-
>  gcc/config/alpha/alpha.md                     |  19 +++-
>  gcc/config/arc/arc.md                         |  20 +++-
>  gcc/config/arm/arm.md                         |  19 +++-
>  gcc/config/frv/frv.md                         |   6 +-
>  gcc/config/i386/i386.md                       |  17 ++-
>  gcc/config/ia64/ia64.md                       |  19 +++-
>  gcc/config/mips/mips.md                       |  22 +++-
>  gcc/config/pa/pa.md                           |  12 +-
>  gcc/config/rs6000/rs6000.md                   |  19 +++-
>  gcc/config/s390/s390.cc                       |  10 +-
>  gcc/config/s390/s390.md                       |  19 +++-
>  gcc/config/sh/sh.md                           |  15 ++-
>  gcc/config/sparc/sparc.md                     |  15 ++-
>  gcc/doc/rtl.texi                              |   6 +-
>  gcc/rtl.def                                   |   5 +-
>  gcc/rtlanal.cc                                |   2 +-
>  gcc/target-insns.def                          |   2 +-
>  .../execute/builtin-prefetch-1.c              |  45 ++++----
>  .../execute/builtin-prefetch-2.c              | 106 +++++++++---------
>  .../execute/builtin-prefetch-3.c              |  92 +++++++--------
>  .../execute/builtin-prefetch-4.c              |  44 ++++----
>  .../execute/builtin-prefetch-5.c              |  12 +-
>  .../execute/builtin-prefetch-6.c              |   4 +-
>  gcc/testsuite/gcc.dg/builtin-prefetch-1.c     |   5 +-
>  .../gcc.misc-tests/i386-pf-3dnow-1.c          |  16 +--
>  .../gcc.misc-tests/i386-pf-athlon-1.c         |  16 +--
>  gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c |  16 +--
>  gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  16 +--
>  .../gcc.target/aarch64/prefetchi-1.c          |  11 ++
>  gcc/testsuite/gcc.target/alpha/prefetchi-1.c  |  11 ++
>  gcc/testsuite/gcc.target/arc/prefetchi-1.c    |  11 ++
>  gcc/testsuite/gcc.target/arm/prefetchi-1.c    |  11 ++
>  gcc/testsuite/gcc.target/hppa/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/i386/avx-1.c         |   2 +-
>  gcc/testsuite/gcc.target/i386/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/i386/sse-13.c        |   2 +-
>  gcc/testsuite/gcc.target/i386/sse-23.c        |   2 +-
>  gcc/testsuite/gcc.target/ia64/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/mips/prefetchi-1.c   |  11 ++
>  .../gcc.target/powerpc/prefetchi-1.c          |  11 ++
>  gcc/testsuite/gcc.target/s390/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/sh/prefetchi-1.c     |  11 ++
>  gcc/testsuite/gcc.target/sparc/prefetchi-1.c  |  11 ++
>  46 files changed, 564 insertions(+), 241 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/alpha/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/hppa/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/ia64/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/mips/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/sh/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/sparc/prefetchi-1.c
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 5f319b28030..2e6d0c76beb 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -1282,18 +1282,18 @@ expand_builtin_update_setjmp_buf (rtx buf_addr)
>  static void
>  expand_builtin_prefetch (tree exp)
>  {
> -  tree arg0, arg1, arg2;
> +  tree arg0, arg1, arg2, arg3;
>    int nargs;
> -  rtx op0, op1, op2;
> +  rtx op0, op1, op2, op3;
>
>    if (!validate_arglist (exp, POINTER_TYPE, 0))
>      return;
>
>    arg0 = CALL_EXPR_ARG (exp, 0);
>
> -  /* Arguments 1 and 2 are optional; argument 1 (read/write) defaults to
> -     zero (read) and argument 2 (locality) defaults to 3 (high degree of
> -     locality).  */
> +  /* Arguments 1, 2, 3 are optional; argument 1 (read/write) defaults to
> +     zero (read); argument 2 (locality) defaults to 3 (high degree of
> +     locality); argument 3 (cache type) defaults to 1 (data).  */
>    nargs = call_expr_nargs (exp);
>    if (nargs > 1)
>      arg1 = CALL_EXPR_ARG (exp, 1);
> @@ -1303,6 +1303,10 @@ expand_builtin_prefetch (tree exp)
>      arg2 = CALL_EXPR_ARG (exp, 2);
>    else
>      arg2 = integer_three_node;
> +  if (nargs > 3)
> +    arg3 = CALL_EXPR_ARG (exp, 3);
> +  else
> +    arg3 = integer_one_node;
>
>    /* Argument 0 is an address.  */
>    op0 = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL);
> @@ -1336,14 +1340,30 @@ expand_builtin_prefetch (tree exp)
>        op2 = const0_rtx;
>      }
>
> +  /* Argument 3 (cache type) must be a compile-time constant int.  */
> +  if (TREE_CODE (arg3) != INTEGER_CST)
> +    {
> +      error ("fourth argument to %<__builtin_prefetch%> must be a constant");
> +      arg3 = integer_one_node;
> +    }
> +  op3 = expand_normal (arg3);
> +  /* Argument 3 must be either zero or one.  */
> +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> +    {
> +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> +       " using one");
> +      op3 = const1_rtx;
> +    }
> +
>    if (targetm.have_prefetch ())
>      {
> -      class expand_operand ops[3];
> +      class expand_operand ops[4];
>
>        create_address_operand (&ops[0], op0);
>        create_integer_operand (&ops[1], INTVAL (op1));
>        create_integer_operand (&ops[2], INTVAL (op2));
> -      if (maybe_expand_insn (targetm.code_for_prefetch, 3, ops))
> +      create_integer_operand (&ops[3], INTVAL (op3));
> +      if (maybe_expand_insn (targetm.code_for_prefetch, 4, ops))
>         return;
>      }
>
> diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
> index e08bee197d8..0cde862bc04 100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -1944,7 +1944,8 @@
>                 (match_operand:DI 2 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH)
>              (match_operand:DI 3 "const_int_operand")
> -            (match_operand:DI 4 "const_int_operand"))]
> +            (match_operand:DI 4 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      operands[1] = gen_rtx_MEM (<MODE>mode, operands[1]);
> @@ -1984,7 +1985,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> @@ -2013,7 +2015,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> @@ -2044,7 +2047,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> @@ -2074,7 +2078,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index f2e3d905dbb..94fa6b4200c 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -818,10 +818,25 @@
>    [(set_attr "type" "no_insn")]
>  )
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand")
> +            (match_operand:QI 1 "const_int_operand")
> +            (match_operand:QI 2 "const_int_operand")
> +           (match_operand:QI 3 "const_int_operand"))]
> +  ""
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand" "Dp")
>              (match_operand:QI 1 "const_int_operand" "")
> -            (match_operand:QI 2 "const_int_operand" ""))]
> +            (match_operand:QI 2 "const_int_operand" "")
> +           (const_int 1))]
>    ""
>    {
>      const char * pftype[2][4] =
> diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
> index 87514330c22..46fd6a7b7cb 100644
> --- a/gcc/config/alpha/alpha.md
> +++ b/gcc/config/alpha/alpha.md
> @@ -5176,10 +5176,25 @@
>  ;;
>  ;; On EV6, these become official prefetch instructions.
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "address_operand")
> +            (match_operand:DI 1 "const_int_operand")
> +            (match_operand:DI 2 "const_int_operand")
> +            (match_operand:DI 3 "const_int_operand"))]
> +  "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:DI 0 "address_operand" "p")
>              (match_operand:DI 1 "const_int_operand" "n")
> -            (match_operand:DI 2 "const_int_operand" "n"))]
> +            (match_operand:DI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
>  {
>    /* Interpret "no temporal locality" as this data should be evicted once
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 458d3edf716..9607a0dd572 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -5255,14 +5255,22 @@ archs4x, archs4xd"
>  (define_expand "prefetch"
>    [(prefetch (match_operand:SI 0 "address_operand" "")
>              (match_operand:SI 1 "const_int_operand" "")
> -            (match_operand:SI 2 "const_int_operand" ""))]
> +            (match_operand:SI 2 "const_int_operand" "")
> +            (match_operand:SI 3 "const_int_operand" ""))]
>    "TARGET_HS"
> -  "")
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
>
>  (define_insn "prefetch_1"
>    [(prefetch (match_operand:SI 0 "register_operand" "r")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_HS"
>    {
>     if (INTVAL (operands[1]))
> @@ -5277,7 +5285,8 @@ archs4x, archs4xd"
>    [(prefetch (plus:SI (match_operand:SI 0 "register_operand" "r,r,r")
>                       (match_operand:SI 1 "nonmemory_operand" "r,Cm2,Cal"))
>              (match_operand:SI 2 "const_int_operand" "n,n,n")
> -            (match_operand:SI 3 "const_int_operand" "n,n,n"))]
> +            (match_operand:SI 3 "const_int_operand" "n,n,n")
> +            (const_int 1))]
>    "TARGET_HS"
>    {
>     if (INTVAL (operands[2]))
> @@ -5291,7 +5300,8 @@ archs4x, archs4xd"
>  (define_insn "prefetch_3"
>    [(prefetch (match_operand:SI 0 "address_operand" "p")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_HS"
>    {
>     operands[0] = gen_rtx_MEM (SImode, operands[0]);
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 69bf343fb0e..7f2ec97406f 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -12206,10 +12206,25 @@
>
>  ;; V5E instructions.
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:SI 0 "address_operand")
> +            (match_operand:SI 1 "")
> +            (match_operand:SI 2 "")
> +            (match_operand:SI 3 ""))]
> +  "TARGET_32BIT && arm_arch5te"
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:SI 0 "address_operand" "p")
>              (match_operand:SI 1 "" "")
> -            (match_operand:SI 2 "" ""))]
> +            (match_operand:SI 2 "" "")
> +            (const_int 1))]
>    "TARGET_32BIT && arm_arch5te"
>    "pld\\t%a0"
>    [(set_attr "type" "load_4")]
> diff --git a/gcc/config/frv/frv.md b/gcc/config/frv/frv.md
> index 6258fe3b99e..2fb9de593c9 100644
> --- a/gcc/config/frv/frv.md
> +++ b/gcc/config/frv/frv.md
> @@ -7631,7 +7631,8 @@
>    [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
>                         UNSPEC_PREFETCH0)
>              (const_int 0)
> -            (const_int 0))]
> +            (const_int 0)
> +            (const_int 1))]
>    ""
>    "dcpl %0, gr0, #0"
>    [(set_attr "length" "4")])
> @@ -7640,7 +7641,8 @@
>    [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
>                         UNSPEC_PREFETCH)
>              (const_int 0)
> -            (const_int 0))]
> +            (const_int 0)
> +            (const_int 1))]
>    "TARGET_FR500_FR550_BUILTINS"
>    "nop.p\\n\\tnldub @(%0, gr0), gr0"
>    [(set_attr "length" "8")])
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 8e847520491..c65cf14b9f4 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -23635,9 +23635,15 @@
>  (define_expand "prefetch"
>    [(prefetch (match_operand 0 "address_operand")
>              (match_operand:SI 1 "const_int_operand")
> -            (match_operand:SI 2 "const_int_operand"))]
> +            (match_operand:SI 2 "const_int_operand")
> +            (match_operand:SI 3 "const_int_operand"))]
>    "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
>  {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
>    bool write = operands[1] != const0_rtx;
>    int locality = INTVAL (operands[2]);
>
> @@ -23679,7 +23685,8 @@
>  (define_insn "*prefetch_sse"
>    [(prefetch (match_operand 0 "address_operand" "p")
>              (const_int 0)
> -            (match_operand:SI 1 "const_int_operand"))]
> +            (match_operand:SI 1 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_PREFETCH_SSE"
>  {
>    static const char * const patterns[4] = {
> @@ -23700,7 +23707,8 @@
>  (define_insn "*prefetch_3dnow"
>    [(prefetch (match_operand 0 "address_operand" "p")
>              (match_operand:SI 1 "const_int_operand")
> -            (const_int 3))]
> +            (const_int 3)
> +            (const_int 1))]
>    "TARGET_3DNOW || TARGET_PRFCHW || TARGET_PREFETCHWT1"
>  {
>    if (operands[1] == const0_rtx)
> @@ -23716,7 +23724,8 @@
>  (define_insn "*prefetch_prefetchwt1"
>    [(prefetch (match_operand 0 "address_operand" "p")
>              (const_int 1)
> -            (const_int 2))]
> +            (const_int 2)
> +            (const_int 1))]
>    "TARGET_PREFETCHWT1"
>    "prefetchwt1\t%a0";
>    [(set_attr "type" "sse")
> diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
> index 5d1d47da55b..9fbbea3412a 100644
> --- a/gcc/config/ia64/ia64.md
> +++ b/gcc/config/ia64/ia64.md
> @@ -5018,10 +5018,25 @@
>    "break.f 0"
>    [(set_attr "itanium_class" "nop_f")])
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "address_operand")
> +            (match_operand:DI 1 "const_int_operand")
> +            (match_operand:DI 2 "const_int_operand")
> +            (match_operand:DI 3 "const_int_operand"))]
> +  ""
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:DI 0 "address_operand" "p")
>              (match_operand:DI 1 "const_int_operand" "n")
> -            (match_operand:DI 2 "const_int_operand" "n"))]
> +            (match_operand:DI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>    static const char * const alt[2][4] = {
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index e0f0a582732..b5c547806b4 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -7227,10 +7227,25 @@
>  ;;
>
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:QI 0 "address_operand")
> +            (match_operand 1 "const_int_operand")
> +            (match_operand 2 "const_int_operand")
> +            (match_operand 3 "const_int_operand"))]
> +  "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:QI 0 "address_operand" "ZD")
>              (match_operand 1 "const_int_operand" "n")
> -            (match_operand 2 "const_int_operand" "n"))]
> +            (match_operand 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
>  {
>    if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT)
> @@ -7257,7 +7272,8 @@
>    [(prefetch (plus:P (match_operand:P 0 "register_operand" "d")
>                      (match_operand:P 1 "register_operand" "d"))
>              (match_operand 2 "const_int_operand" "n")
> -            (match_operand 3 "const_int_operand" "n"))]
> +            (match_operand 3 "const_int_operand" "n")
> +            (const_int 1))]
>    "ISA_HAS_PREFETCHX && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
>  {
>    if (TARGET_LOONGSON_EXT)
> diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
> index 76ae35d4cfa..a7469074c01 100644
> --- a/gcc/config/pa/pa.md
> +++ b/gcc/config/pa/pa.md
> @@ -10201,9 +10201,16 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
>  (define_expand "prefetch"
>    [(match_operand 0 "address_operand" "")
>     (match_operand 1 "const_int_operand" "")
> -   (match_operand 2 "const_int_operand" "")]
> +   (match_operand 2 "const_int_operand" "")
> +   (match_operand 3 "const_int_operand" "")]
>    "TARGET_PA_20"
>  {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +
>    operands[0] = copy_addr_to_reg (operands[0]);
>    emit_insn (gen_prefetch_20 (operands[0], operands[1], operands[2]));
>    DONE;
> @@ -10212,7 +10219,8 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
>  (define_insn "prefetch_20"
>    [(prefetch (match_operand 0 "pmode_register_operand" "r")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_PA_20"
>  {
>    /* The SL cache-control completer indicates good spatial locality but
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index ad5a4cf2ef8..21ff09eca93 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -14060,10 +14060,25 @@
>    DONE;
>  })
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> +            (match_operand:SI 1 "const_int_operand")
> +            (match_operand:SI 2 "const_int_operand")
> +            (match_operand:SI 3 "const_int_operand"))]
> +  ""
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index ae309471f04..3fc5ae196b8 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -5697,13 +5697,13 @@ s390_expand_cpymem (rtx dst, rtx src, rtx len)
>
>           /* Issue a read prefetch for the +3 cache line.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, src_addr, GEN_INT (768)),
> -                                  const0_rtx, const0_rtx);
> +                                  const0_rtx, const0_rtx, const1_rtx);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>           emit_insn (prefetch);
>
>           /* Issue a write prefetch for the +3 cache line.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, GEN_INT (768)),
> -                                  const1_rtx, const0_rtx);
> +                                  const1_rtx, const0_rtx, const1_rtx);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>           emit_insn (prefetch);
>         }
> @@ -5872,7 +5872,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
>           /* Issue a write prefetch.  */
>           rtx distance = GEN_INT (TARGET_SETMEM_PREFETCH_DISTANCE);
>           rtx prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, distance),
> -                                      const1_rtx, const0_rtx);
> +                                      const1_rtx, const0_rtx, const1_rtx);
>           emit_insn (prefetch);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>         }
> @@ -5999,13 +5999,13 @@ s390_expand_cmpmem (rtx target, rtx op0, rtx op1, rtx len)
>
>           /* Issue a read prefetch for the +2 cache line of operand 1.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr0, GEN_INT (512)),
> -                                  const0_rtx, const0_rtx);
> +                                  const0_rtx, const0_rtx, const1_rtx);
>           emit_insn (prefetch);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>
>           /* Issue a read prefetch for the +2 cache line of operand 2.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr1, GEN_INT (512)),
> -                                  const0_rtx, const0_rtx);
> +                                  const0_rtx, const0_rtx, const1_rtx);
>           emit_insn (prefetch);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>         }
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index 962927c3112..4b094aa2bcf 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -11601,10 +11601,25 @@
>  ; Data prefetch patterns
>  ;
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand 0    "address_operand")
> +            (match_operand:SI 1 "const_int_operand")
> +            (match_operand:SI 2 "const_int_operand")
> +             (match_operand:SI 3 "const_int_operand"))]
> +  "TARGET_Z10"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand 0    "address_operand"   "ZT,X")
>              (match_operand:SI 1 "const_int_operand" " n,n")
> -            (match_operand:SI 2 "const_int_operand" " n,n"))]
> +            (match_operand:SI 2 "const_int_operand" " n,n")
> +             (const_int 1))]
>    "TARGET_Z10"
>  {
>    switch (which_alternative)
> diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
> index 59a7b216433..54a8270e80e 100644
> --- a/gcc/config/sh/sh.md
> +++ b/gcc/config/sh/sh.md
> @@ -10928,13 +10928,22 @@
>  (define_expand "prefetch"
>    [(prefetch (match_operand 0 "address_operand" "")
>              (match_operand:SI 1 "const_int_operand" "")
> -            (match_operand:SI 2 "const_int_operand" ""))]
> -  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP")
> +            (match_operand:SI 2 "const_int_operand" "")
> +            (match_operand:SI 3 "const_int_operand" ""))]
> +  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
>
>  (define_insn "*prefetch"
>    [(prefetch (match_operand:SI 0 "register_operand" "r")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "(TARGET_SH2A || TARGET_SH3) && ! TARGET_VXWORKS_RTP"
>    "pref        @%0"
>    [(set_attr "type" "other")])
> diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
> index 691e707863a..04cb6935b1b 100644
> --- a/gcc/config/sparc/sparc.md
> +++ b/gcc/config/sparc/sparc.md
> @@ -7816,9 +7816,16 @@ visl")
>  (define_expand "prefetch"
>    [(match_operand 0 "address_operand" "")
>     (match_operand 1 "const_int_operand" "")
> -   (match_operand 2 "const_int_operand" "")]
> +   (match_operand 2 "const_int_operand" "")
> +   (match_operand 3 "const_int_operand" "")]
>    "TARGET_V9"
>  {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +
>    if (TARGET_ARCH64)
>      emit_insn (gen_prefetch_64 (operands[0], operands[1], operands[2]));
>    else
> @@ -7829,7 +7836,8 @@ visl")
>  (define_insn "prefetch_64"
>    [(prefetch (match_operand:DI 0 "address_operand" "p")
>              (match_operand:DI 1 "const_int_operand" "n")
> -            (match_operand:DI 2 "const_int_operand" "n"))]
> +            (match_operand:DI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>    static const char * const prefetch_instr[2][2] = {
> @@ -7855,7 +7863,8 @@ visl")
>  (define_insn "prefetch_32"
>    [(prefetch (match_operand:SI 0 "address_operand" "p")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>    static const char * const prefetch_instr[2][2] = {
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 43c9ee8bffe..592f4b0e4dd 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -3454,7 +3454,7 @@ position of @var{base}, @var{min} and @var{max} to the containing insn
>  and of @var{min} and @var{max} to @var{base}.  See rtl.def for details.
>
>  @findex prefetch
> -@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality})
> +@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality} @var{cache})
>  Represents prefetch of memory at address @var{addr}.
>  Operand @var{rw} is 1 if the prefetch is for data to be written, 0 otherwise;
>  targets that do not support write prefetches should treat this as a normal
> @@ -3462,6 +3462,10 @@ prefetch.
>  Operand @var{locality} specifies the amount of temporal locality; 0 if there
>  is none or 1, 2, or 3 for increasing levels of temporal locality;
>  targets that do not support locality hints should ignore this.
> +Operand @var{cache} is 1 if the prefetch is prefetching data, 0 for prefetching
> +instruction;
> +targets that do not support instruction prefetch should treat all as data
> +prefetch.
>
>  This insn is used to minimize cache-miss latency by moving data into a
>  cache before it is accessed.  It should use only non-faulting data prefetch
> diff --git a/gcc/rtl.def b/gcc/rtl.def
> index 08e31fa3544..f2e37d55023 100644
> --- a/gcc/rtl.def
> +++ b/gcc/rtl.def
> @@ -277,10 +277,11 @@ DEF_RTL_EXPR(ADDR_DIFF_VEC, "addr_diff_vec", "eEee0", RTX_EXTRA)
>     Operand 3 is the level of temporal locality; 0 means there is no
>     temporal locality and 1, 2, and 3 are for increasing levels of temporal
>     locality.
> +   Operand 4 is 1 for prefetch data, 0 for prefetch instrction.
>
> -   The attributes specified by operands 2 and 3 are ignored for targets
> +   The attributes specified by operands 2, 3 and 4 are ignored for targets
>     whose prefetch instructions do not support them.  */
> -DEF_RTL_EXPR(PREFETCH, "prefetch", "eee", RTX_EXTRA)
> +DEF_RTL_EXPR(PREFETCH, "prefetch", "eeee", RTX_EXTRA)
>
>  /* ----------------------------------------------------------------------
>     At the top level of an instruction (perhaps under PARALLEL).
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index 56da7435a28..7eeef285f1e 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -6196,7 +6196,7 @@ setup_reg_subrtx_bounds (unsigned int code)
>    while (format[i] == 'e');
>    rtx_all_subrtx_bounds[code].count = i - rtx_all_subrtx_bounds[code].start;
>    /* rtl-iter.h relies on this.  */
> -  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 3);
> +  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 4);
>
>    for (; format[i]; ++i)
>      if (format[i] == 'E' || format[i] == 'V' || format[i] == 'e')
> diff --git a/gcc/target-insns.def b/gcc/target-insns.def
> index de8c0092f98..ca13d1c4393 100644
> --- a/gcc/target-insns.def
> +++ b/gcc/target-insns.def
> @@ -76,7 +76,7 @@ DEF_TARGET_INSN (omp_simt_ordered, (rtx x0, rtx x1))
>  DEF_TARGET_INSN (omp_simt_vote_any, (rtx x0, rtx x1))
>  DEF_TARGET_INSN (omp_simt_xchg_bfly, (rtx x0, rtx x1, rtx x2))
>  DEF_TARGET_INSN (omp_simt_xchg_idx, (rtx x0, rtx x1, rtx x2))
> -DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2))
> +DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2, rtx x3))
>  DEF_TARGET_INSN (probe_stack, (rtx x0))
>  DEF_TARGET_INSN (probe_stack_address, (rtx x0))
>  DEF_TARGET_INSN (prologue, (void))
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> index 4ee05a94d9f..ccc5fab15e5 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> @@ -1,57 +1,62 @@
>  /* Test that __builtin_prefetch does no harm.
>
> -   Prefetch using all valid combinations of rw and locality values.
> +   Prefetch using all valid combinations of cache, rw and locality values.
>     These must be compile-time constants.  */
>
>  #define NO_TEMPORAL_LOCALITY 0
>  #define LOW_TEMPORAL_LOCALITY 1
> -#define MODERATE_TEMPORAL_LOCALITY 1
> +#define MODERATE_TEMPORAL_LOCALITY 2
>  #define HIGH_TEMPORAL_LOCALITY 3
>
>  #define WRITE_ACCESS 1
>  #define READ_ACCESS 0
>
> +#define DATA_PRFCH 1
> +#define INST_PRFCH 0
> +
>  enum locality { none, low, moderate, high };
>  enum rw { read, write };
> +enum cache { inst, data };
>
>  int arr[10];
>
>  void
>  good_const (const int *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, READ_ACCESS, 3);
> -  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, READ_ACCESS, 3, 1);
> +  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY, DATA_PRFCH);
>  }
>
>  void
>  good_enum (const int *p)
>  {
> -    __builtin_prefetch (p, read, none);
> -    __builtin_prefetch (p, read, low);
> -    __builtin_prefetch (p, read, moderate);
> -    __builtin_prefetch (p, read, high);
> -    __builtin_prefetch (p, write, none);
> -    __builtin_prefetch (p, write, low);
> -    __builtin_prefetch (p, write, moderate);
> -    __builtin_prefetch (p, write, high);
> +    __builtin_prefetch (p, read, none, data);
> +    __builtin_prefetch (p, read, low, data);
> +    __builtin_prefetch (p, read, moderate, data);
> +    __builtin_prefetch (p, read, high, data);
> +    __builtin_prefetch (p, write, none, data);
> +    __builtin_prefetch (p, write, low, data);
> +    __builtin_prefetch (p, write, moderate, data);
> +    __builtin_prefetch (p, write, high, data);
>  }
>
>  void
>  good_expr (const int *p)
>  {
> -  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3));
> -  __builtin_prefetch (p, 1 + 0, 1 + 2);
> +  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3), 1 + 0);
> +  __builtin_prefetch (p, 1 + 0, 1 + 2, 0 + 1);
>  }
>
>  void
>  good_vararg (const int *p)
>  {
> +  __builtin_prefetch (p, 0, 3, 1);
>    __builtin_prefetch (p, 0, 3);
>    __builtin_prefetch (p, 0);
>    __builtin_prefetch (p, 1);
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> index 530a1b0ef0d..6aff1f281e0 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> @@ -26,9 +26,9 @@ struct S *ptr_str = &str;
>  void
>  simple_global ()
>  {
> -  __builtin_prefetch (glob_int_arr, 0, 0);
> -  __builtin_prefetch (glob_ptr_int, 0, 0);
> -  __builtin_prefetch (&glob_int, 0, 0);
> +  __builtin_prefetch (glob_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_int, 0, 0, 1);
>  }
>
>  /* Prefetch file-level static variables using the address of the variable.  */
> @@ -36,9 +36,9 @@ simple_global ()
>  void
>  simple_file ()
>  {
> -  __builtin_prefetch (stat_int_arr, 0, 0);
> -  __builtin_prefetch (stat_ptr_int, 0, 0);
> -  __builtin_prefetch (&stat_int, 0, 0);
> +  __builtin_prefetch (stat_int_arr, 0, 0, 1);
> +  __builtin_prefetch (stat_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&stat_int, 0, 0, 1);
>  }
>
>  /* Prefetch local static variables using the address of the variable.  */
> @@ -49,9 +49,9 @@ simple_static_local ()
>    static int gx[100];
>    static int *hx = gx;
>    static int ix;
> -  __builtin_prefetch (gx, 0, 0);
> -  __builtin_prefetch (hx, 0, 0);
> -  __builtin_prefetch (&ix, 0, 0);
> +  __builtin_prefetch (gx, 0, 0, 1);
> +  __builtin_prefetch (hx, 0, 0, 1);
> +  __builtin_prefetch (&ix, 0, 0, 1);
>  }
>
>  /* Prefetch local stack variables using the address of the variable.  */
> @@ -62,9 +62,9 @@ simple_local ()
>    int gx[100];
>    int *hx = gx;
>    int ix;
> -  __builtin_prefetch (gx, 0, 0);
> -  __builtin_prefetch (hx, 0, 0);
> -  __builtin_prefetch (&ix, 0, 0);
> +  __builtin_prefetch (gx, 0, 0, 1);
> +  __builtin_prefetch (hx, 0, 0, 1);
> +  __builtin_prefetch (&ix, 0, 0, 1);
>  }
>
>  /* Prefetch arguments using the address of the variable.  */
> @@ -72,9 +72,9 @@ simple_local ()
>  void
>  simple_arg (int g[100], int *h, int i)
>  {
> -  __builtin_prefetch (g, 0, 0);
> -  __builtin_prefetch (h, 0, 0);
> -  __builtin_prefetch (&i, 0, 0);
> +  __builtin_prefetch (g, 0, 0, 1);
> +  __builtin_prefetch (h, 0, 0, 1);
> +  __builtin_prefetch (&i, 0, 0, 1);
>  }
>
>  /* Prefetch using address expressions involving global variables.  */
> @@ -82,25 +82,25 @@ simple_arg (int g[100], int *h, int i)
>  void
>  expr_global (void)
>  {
> -  __builtin_prefetch (&str, 0, 0);
> -  __builtin_prefetch (ptr_str, 0, 0);
> -  __builtin_prefetch (&str.b, 0, 0);
> -  __builtin_prefetch (&ptr_str->b, 0, 0);
> -  __builtin_prefetch (&str.d, 0, 0);
> -  __builtin_prefetch (&ptr_str->d, 0, 0);
> -  __builtin_prefetch (str.next, 0, 0);
> -  __builtin_prefetch (ptr_str->next, 0, 0);
> -  __builtin_prefetch (str.next->d, 0, 0);
> -  __builtin_prefetch (ptr_str->next->d, 0, 0);
> -
> -  __builtin_prefetch (&glob_int_arr, 0, 0);
> -  __builtin_prefetch (glob_ptr_int, 0, 0);
> -  __builtin_prefetch (&glob_int_arr[2], 0, 0);
> -  __builtin_prefetch (&glob_ptr_int[3], 0, 0);
> -  __builtin_prefetch (glob_int_arr+3, 0, 0);
> -  __builtin_prefetch (glob_int_arr+glob_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_int+5, 0, 0);
> -  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0);
> +  __builtin_prefetch (&str, 0, 0, 1);
> +  __builtin_prefetch (ptr_str, 0, 0, 1);
> +  __builtin_prefetch (&str.b, 0, 0, 1);
> +  __builtin_prefetch (&ptr_str->b, 0, 0, 1);
> +  __builtin_prefetch (&str.d, 0, 0, 1);
> +  __builtin_prefetch (&ptr_str->d, 0, 0, 1);
> +  __builtin_prefetch (str.next, 0, 0, 1);
> +  __builtin_prefetch (ptr_str->next, 0, 0, 1);
> +  __builtin_prefetch (str.next->d, 0, 0, 1);
> +  __builtin_prefetch (ptr_str->next->d, 0, 0, 1);
> +
> +  __builtin_prefetch (&glob_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_int_arr[2], 0, 0, 1);
> +  __builtin_prefetch (&glob_ptr_int[3], 0, 0, 1);
> +  __builtin_prefetch (glob_int_arr+3, 0, 0, 1);
> +  __builtin_prefetch (glob_int_arr+glob_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0, 1);
>  }
>
>  /* Prefetch using address expressions involving local variables.  */
> @@ -114,25 +114,25 @@ expr_local (void)
>    struct S *pt = &t;
>    int j = 4;
>
> -  __builtin_prefetch (&t, 0, 0);
> -  __builtin_prefetch (pt, 0, 0);
> -  __builtin_prefetch (&t.b, 0, 0);
> -  __builtin_prefetch (&pt->b, 0, 0);
> -  __builtin_prefetch (&t.d, 0, 0);
> -  __builtin_prefetch (&pt->d, 0, 0);
> -  __builtin_prefetch (t.next, 0, 0);
> -  __builtin_prefetch (pt->next, 0, 0);
> -  __builtin_prefetch (t.next->d, 0, 0);
> -  __builtin_prefetch (pt->next->d, 0, 0);
> -
> -  __builtin_prefetch (&b, 0, 0);
> -  __builtin_prefetch (pb, 0, 0);
> -  __builtin_prefetch (&b[2], 0, 0);
> -  __builtin_prefetch (&pb[3], 0, 0);
> -  __builtin_prefetch (b+3, 0, 0);
> -  __builtin_prefetch (b+j, 0, 0);
> -  __builtin_prefetch (pb+5, 0, 0);
> -  __builtin_prefetch (pb+j, 0, 0);
> +  __builtin_prefetch (&t, 0, 0, 1);
> +  __builtin_prefetch (pt, 0, 0, 1);
> +  __builtin_prefetch (&t.b, 0, 0, 1);
> +  __builtin_prefetch (&pt->b, 0, 0, 1);
> +  __builtin_prefetch (&t.d, 0, 0, 1);
> +  __builtin_prefetch (&pt->d, 0, 0, 1);
> +  __builtin_prefetch (t.next, 0, 0, 1);
> +  __builtin_prefetch (pt->next, 0, 0, 1);
> +  __builtin_prefetch (t.next->d, 0, 0, 1);
> +  __builtin_prefetch (pt->next->d, 0, 0, 1);
> +
> +  __builtin_prefetch (&b, 0, 0, 1);
> +  __builtin_prefetch (pb, 0, 0, 1);
> +  __builtin_prefetch (&b[2], 0, 0, 1);
> +  __builtin_prefetch (&pb[3], 0, 0, 1);
> +  __builtin_prefetch (b+3, 0, 0, 1);
> +  __builtin_prefetch (b+j, 0, 0, 1);
> +  __builtin_prefetch (pb+5, 0, 0, 1);
> +  __builtin_prefetch (pb+j, 0, 0, 1);
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> index 2e2e808c172..38ce410384a 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> @@ -36,11 +36,11 @@ volatile struct S * volatile vol_ptr_vol_str = &vol_str;
>  void
>  simple_vol_global ()
>  {
> -  __builtin_prefetch (glob_vol_int_arr, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&glob_vol_int, 0, 0);
> +  __builtin_prefetch (glob_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_int, 0, 0, 1);
>  }
>
>  /* Prefetch volatile static variables using the address of the variable.  */
> @@ -48,11 +48,11 @@ simple_vol_global ()
>  void
>  simple_vol_file ()
>  {
> -  __builtin_prefetch (stat_vol_int_arr, 0, 0);
> -  __builtin_prefetch (stat_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (stat_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&stat_vol_int, 0, 0);
> +  __builtin_prefetch (stat_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (stat_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (stat_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&stat_vol_int, 0, 0, 1);
>  }
>
>  /* Prefetch using address expressions involving volatile global variables.  */
> @@ -60,43 +60,43 @@ simple_vol_file ()
>  void
>  expr_vol_global (void)
>  {
> -  __builtin_prefetch (&vol_str, 0, 0);
> -  __builtin_prefetch (ptr_vol_str, 0, 0);
> -  __builtin_prefetch (vol_ptr_str, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str, 0, 0);
> -  __builtin_prefetch (&vol_str.b, 0, 0);
> -  __builtin_prefetch (&ptr_vol_str->b, 0, 0);
> -  __builtin_prefetch (&vol_ptr_str->b, 0, 0);
> -  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0);
> -  __builtin_prefetch (&vol_str.d, 0, 0);
> -  __builtin_prefetch (&vol_ptr_str->d, 0, 0);
> -  __builtin_prefetch (&ptr_vol_str->d, 0, 0);
> -  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0);
> -  __builtin_prefetch (vol_str.next, 0, 0);
> -  __builtin_prefetch (vol_ptr_str->next, 0, 0);
> -  __builtin_prefetch (ptr_vol_str->next, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0);
> -  __builtin_prefetch (vol_str.next->d, 0, 0);
> -  __builtin_prefetch (vol_ptr_str->next->d, 0, 0);
> -  __builtin_prefetch (ptr_vol_str->next->d, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0);
> +  __builtin_prefetch (&vol_str, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str, 0, 0, 1);
> +  __builtin_prefetch (&vol_str.b, 0, 0, 1);
> +  __builtin_prefetch (&ptr_vol_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_str.d, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_str->d, 0, 0, 1);
> +  __builtin_prefetch (&ptr_vol_str->d, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0, 1);
> +  __builtin_prefetch (vol_str.next, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str->next, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str->next, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0, 1);
> +  __builtin_prefetch (vol_str.next->d, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str->next->d, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str->next->d, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0, 1);
>
> -  __builtin_prefetch (&glob_vol_int_arr, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0);
> -  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0);
> -  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0);
> -  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0);
> -  __builtin_prefetch (glob_vol_int_arr+3, 0, 0);
> -  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0);
> +  __builtin_prefetch (&glob_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0, 1);
> +  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0, 1);
> +  __builtin_prefetch (glob_vol_int_arr+3, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0, 1);
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> index ade892b21a7..69b4cbe1854 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> @@ -17,7 +17,7 @@ int
>  assign_arg_ptr (int *p)
>  {
>    int *q;
> -  __builtin_prefetch ((q = p), 0, 0);
> +  __builtin_prefetch ((q = p), 0, 0, 1);
>    return q == p;
>  }
>
> @@ -25,7 +25,7 @@ int
>  assign_glob_ptr (void)
>  {
>    int *q;
> -  __builtin_prefetch ((q = ptr), 0, 0);
> +  __builtin_prefetch ((q = ptr), 0, 0, 1);
>    return q == ptr;
>  }
>
> @@ -33,7 +33,7 @@ int
>  assign_arg_idx (int *p, int i)
>  {
>    int j;
> -  __builtin_prefetch (&p[j = i], 0, 0);
> +  __builtin_prefetch (&p[j = i], 0, 0, 1);
>    return j == i;
>  }
>
> @@ -41,7 +41,7 @@ int
>  assign_glob_idx (void)
>  {
>    int j;
> -  __builtin_prefetch (&ptr[j = arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[j = arrindex], 0, 0, 1);
>    return j == arrindex;
>  }
>
> @@ -53,7 +53,7 @@ preinc_arg_ptr (int *p)
>  {
>    int *q;
>    q = p + 1;
> -  __builtin_prefetch (++p, 0, 0);
> +  __builtin_prefetch (++p, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -62,7 +62,7 @@ preinc_glob_ptr (void)
>  {
>    int *q;
>    q = ptr + 1;
> -  __builtin_prefetch (++ptr, 0, 0);
> +  __builtin_prefetch (++ptr, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -71,7 +71,7 @@ postinc_arg_ptr (int *p)
>  {
>    int *q;
>    q = p + 1;
> -  __builtin_prefetch (p++, 0, 0);
> +  __builtin_prefetch (p++, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -80,7 +80,7 @@ postinc_glob_ptr (void)
>  {
>    int *q;
>    q = ptr + 1;
> -  __builtin_prefetch (ptr++, 0, 0);
> +  __builtin_prefetch (ptr++, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -89,7 +89,7 @@ predec_arg_ptr (int *p)
>  {
>    int *q;
>    q = p - 1;
> -  __builtin_prefetch (--p, 0, 0);
> +  __builtin_prefetch (--p, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -98,7 +98,7 @@ predec_glob_ptr (void)
>  {
>    int *q;
>    q = ptr - 1;
> -  __builtin_prefetch (--ptr, 0, 0);
> +  __builtin_prefetch (--ptr, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -107,7 +107,7 @@ postdec_arg_ptr (int *p)
>  {
>    int *q;
>    q = p - 1;
> -  __builtin_prefetch (p--, 0, 0);
> +  __builtin_prefetch (p--, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -116,7 +116,7 @@ postdec_glob_ptr (void)
>  {
>    int *q;
>    q = ptr - 1;
> -  __builtin_prefetch (ptr--, 0, 0);
> +  __builtin_prefetch (ptr--, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -124,7 +124,7 @@ int
>  preinc_arg_idx (int *p, int i)
>  {
>    int j = i + 1;
> -  __builtin_prefetch (&p[++i], 0, 0);
> +  __builtin_prefetch (&p[++i], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -133,7 +133,7 @@ int
>  preinc_glob_idx (void)
>  {
>    int j = arrindex + 1;
> -  __builtin_prefetch (&ptr[++arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[++arrindex], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -141,7 +141,7 @@ int
>  postinc_arg_idx (int *p, int i)
>  {
>    int j = i + 1;
> -  __builtin_prefetch (&p[i++], 0, 0);
> +  __builtin_prefetch (&p[i++], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -149,7 +149,7 @@ int
>  postinc_glob_idx (void)
>  {
>    int j = arrindex + 1;
> -  __builtin_prefetch (&ptr[arrindex++], 0, 0);
> +  __builtin_prefetch (&ptr[arrindex++], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -157,7 +157,7 @@ int
>  predec_arg_idx (int *p, int i)
>  {
>    int j = i - 1;
> -  __builtin_prefetch (&p[--i], 0, 0);
> +  __builtin_prefetch (&p[--i], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -165,7 +165,7 @@ int
>  predec_glob_idx (void)
>  {
>    int j = arrindex - 1;
> -  __builtin_prefetch (&ptr[--arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[--arrindex], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -173,7 +173,7 @@ int
>  postdec_arg_idx (int *p, int i)
>  {
>    int j = i - 1;
> -  __builtin_prefetch (&p[i--], 0, 0);
> +  __builtin_prefetch (&p[i--], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -181,7 +181,7 @@ int
>  postdec_glob_idx (void)
>  {
>    int j = arrindex - 1;
> -  __builtin_prefetch (&ptr[arrindex--], 0, 0);
> +  __builtin_prefetch (&ptr[arrindex--], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -200,7 +200,7 @@ getptr (int *p)
>  int
>  funccall_arg_ptr (int *p)
>  {
> -  __builtin_prefetch (getptr (p), 0, 0);
> +  __builtin_prefetch (getptr (p), 0, 0, 1);
>    return getptrcnt == 1;
>  }
>
> @@ -216,7 +216,7 @@ getint (int i)
>  int
>  funccall_arg_idx (int *p, int i)
>  {
> -  __builtin_prefetch (&p[getint (i)], 0, 0);
> +  __builtin_prefetch (&p[getint (i)], 0, 0, 1);
>    return getintcnt == 1;
>  }
>
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> index f42a2c0ca87..a6fa1741888 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> @@ -18,32 +18,32 @@ int idx = 3;
>  void
>  arg_ptr (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> +  __builtin_prefetch (p, 0, 0, 1);
>  }
>
>  void
>  arg_idx (char *p, int i)
>  {
> -  __builtin_prefetch (&p[i], 0, 0);
> +  __builtin_prefetch (&p[i], 0, 0, 1);
>  }
>
>  void
>  glob_ptr (void)
>  {
> -  __builtin_prefetch (ptr, 0, 0);
> +  __builtin_prefetch (ptr, 0, 0, 1);
>  }
>
>  void
>  glob_idx (void)
>  {
> -  __builtin_prefetch (&ptr[idx], 0, 0);
> +  __builtin_prefetch (&ptr[idx], 0, 0, 1);
>  }
>
>  int
>  main ()
>  {
> -  __builtin_prefetch (&s.b, 0, 0);
> -  __builtin_prefetch (&s.c[1], 0, 0);
> +  __builtin_prefetch (&s.b, 0, 0, 1);
> +  __builtin_prefetch (&s.c[1], 0, 0, 1);
>
>    arg_ptr (&s.c[1]);
>    arg_ptr (ptr+3);
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> index f643c5c7286..fabecaf56dc 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> @@ -25,7 +25,7 @@ prefetch_for_read (void)
>  {
>    int i;
>    for (i = 0; i < ARRSIZE; i++)
> -    __builtin_prefetch (bad_addr[i], 0, 0);
> +    __builtin_prefetch (bad_addr[i], 0, 0, 1);
>  }
>
>  void
> @@ -33,7 +33,7 @@ prefetch_for_write (void)
>  {
>    int i;
>    for (i = 0; i < ARRSIZE; i++)
> -    __builtin_prefetch (bad_addr[i], 1, 0);
> +    __builtin_prefetch (bad_addr[i], 1, 0, 1);
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> index 11beb4e1bbe..84d564dc72c 100644
> --- a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> +++ b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> @@ -1,6 +1,6 @@
>  /* Test that __builtin_prefetch does no harm.
>
> -   Prefetch using some invalid rw and locality values.  These must be
> +   Prefetch using some invalid cache, rw and locality values.  These must be
>     compile-time constants.  */
>
>  /* { dg-do run } */
> @@ -9,6 +9,7 @@ extern void exit (int);
>
>  enum locality { none, low, moderate, high, bogus };
>  enum rw { read, write };
> +enum cache { inst, data };
>
>  int arr[10];
>
> @@ -34,6 +35,8 @@ bad (int *p)
>    __builtin_prefetch (p, 0, -1);  /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
>    __builtin_prefetch (p, 0, 4);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
>    __builtin_prefetch (p, 0, bogus);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
> +  __builtin_prefetch (p, 0, 3, -1);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
> +  __builtin_prefetch (p, 0, 3, bogus);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> index 638749a5a68..eb9197b357c 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> index d793437f175..b5081815f7a 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> @@ -10,14 +10,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> index 04e814d5a9c..2317f665107 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> index 3707c7074be..936ad9e79ad 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/alpha/prefetchi-1.c b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
> new file mode 100644
> index 00000000000..5d9c387e260
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=ev6" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/arc/prefetchi-1.c b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
> new file mode 100644
> index 00000000000..7e023ab6498
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=archs" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/arm/prefetchi-1.c b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
> new file mode 100644
> index 00000000000..0fbcb7019bc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-O2 -march=armv5te" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/hppa/prefetchi-1.c b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
> new file mode 100644
> index 00000000000..26854a6828d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mpa-risc-2-0" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
> index 051a1b59b5b..ea0b9f6bcef 100644
> --- a/gcc/testsuite/gcc.target/i386/avx-1.c
> +++ b/gcc/testsuite/gcc.target/i386/avx-1.c
> @@ -153,7 +153,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-1.c b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> new file mode 100644
> index 00000000000..b32d59f2e5f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad(const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
> index ca662f7bd47..6c9742cf494 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-13.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-13.c
> @@ -125,7 +125,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
> index ba1310f9f89..344913e9a90 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-23.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-23.c
> @@ -94,7 +94,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/ia64/prefetchi-1.c b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/mips/prefetchi-1.c b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
> new file mode 100644
> index 00000000000..23e78a0c7ba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mips4 -mexplicit-relocs" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/s390/prefetchi-1.c b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
> new file mode 100644
> index 00000000000..5ef557f1d8c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mzarch -march=z10" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/sh/prefetchi-1.c b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
> new file mode 100644
> index 00000000000..347bdea8df8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { has_pref } } } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/sparc/prefetchi-1.c b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
> new file mode 100644
> index 00000000000..1bd7ad495e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=v9" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> --
> 2.18.1
>


-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
  2022-10-14  8:46   ` Hongtao Liu
@ 2022-10-17 15:28   ` Richard Earnshaw
  2022-10-18  9:20     ` [PATCH v2] " Haochen Jiang
  2022-10-19 17:14   ` [PATCH 1/2] " Andrew Pinski
  2022-10-19 21:06   ` Segher Boessenkool
  3 siblings, 1 reply; 26+ messages in thread
From: Richard Earnshaw @ 2022-10-17 15:28 UTC (permalink / raw)
  To: Haochen Jiang, gcc-patches
  Cc: aoliva, richard.sandiford, uweigand, linkw, gnu, dje.gcc,
	olegendo, claziss, segher, mfortune, davem, dave.anglin, hubicka,
	richard.earnshaw, rguenther, marcus.shawcroft,
	ramana.radhakrishnan, hongtao.liu



On 14/10/2022 09:34, Haochen Jiang via Gcc-patches wrote:
> gcc/ChangeLog:
> 
> 	* builtins.cc (expand_builtin_prefetch): Handle the fourth parameter in
> 	expand function.
> 	* config/aarch64/aarch64-sve.md: Add default parameter value.
> 	* config/aarch64/aarch64.md (prefetch): New define_expand.
> 	(*prefetch): Add default parameter value.
> 	* config/alpha/alpha.md (prefetch): New define_expand.
> 	(*prefetch): Add default parameter value.
> 	* config/arc/arc.md: Add default parameter value.
> 	* config/arm/arm.md (prefetch): New define_expand.
> 	(*prefetch): Add default parameter value.
> 	* config/frv/frv.md: Ditto.
> 	* config/i386/i386.md: Ditto.
> 	* config/ia64/ia64.md (prefetch): New define_expand.
> 	(*prefetch): Add default parameter value.
> 	* config/mips/mips.md (prefetch): New define_expand.
> 	(*prefetch): Add default parameter value.
> 	* config/pa/pa.md: Ditto.
> 	* config/rs6000/rs6000.md (prefetch): New define_expand.
> 	(*prefetch): Add default parameter value.
> 	* config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for
> 	gen_prefetch call.
> 	(s390_expand_setmem): Ditto.
> 	(s390_expand_cmpmem): Ditto.
> 	* config/s390/s390.md (prefetch): New define_expand.
> 	(*prefetch): Add default parameter value.
> 	* config/sh/sh.md: Ditto.
> 	* config/sparc/sparc.md: Ditto.
> 	* doc/rtl.texi: Document cache variable for prefetch.
> 	* rtl.def (PREFETCH): Change prefetch DEF_RTL_EXPR to add fourth parameter.
> 	* rtlanal.cc (setup_reg_subrtx_bounds): Change gcc_checking_assert for
> 	fourth parameter.
> 	* target-insns.def (prefetch): Add fourth rtx for prefetch.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.c-torture/execute/builtin-prefetch-1.c: Add fourth parameter for
> 	testcases.
> 	* gcc.c-torture/execute/builtin-prefetch-2.c: Ditto.
> 	* gcc.c-torture/execute/builtin-prefetch-3.c: Ditto.
> 	* gcc.c-torture/execute/builtin-prefetch-4.c: Ditto.
> 	* gcc.c-torture/execute/builtin-prefetch-5.c: Ditto.
> 	* gcc.c-torture/execute/builtin-prefetch-6.c: Ditto.
> 	* gcc.dg/builtin-prefetch-1.c: Ditto.
> 	* gcc.misc-tests/i386-pf-3dnow-1.c: Ditto.
> 	* gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
> 	* gcc.misc-tests/i386-pf-none-1.c: Ditto.
> 	* gcc.misc-tests/i386-pf-sse-1.c: Ditto.
> 	* gcc.target/i386/avx-1.c: Change prefetch macro define to variable args.
> 	* gcc.target/i386/sse-13.c: Ditto.
> 	* gcc.target/i386/sse-23.c: Ditto.
> 	* gcc.target/aarch64/prefetchi-1.c: New test.
> 	* gcc.target/alpha/prefetchi-1.c: Ditto.
> 	* gcc.target/arc/prefetchi-1.c: Ditto.
> 	* gcc.target/arm/prefetchi-1.c: Ditto.
> 	* gcc.target/hppa/prefetchi-1.c: Ditto.
> 	* gcc.target/i386/prefetchi-1.c: Ditto.
> 	* gcc.target/ia64/prefetchi-1.c: Ditto.
> 	* gcc.target/mips/prefetchi-1.c: Ditto.
> 	* gcc.target/powerpc/prefetchi-1.c: Ditto.
> 	* gcc.target/s390/prefetchi-1.c: Ditto.
> 	* gcc.target/sh/prefetchi-1.c: Ditto.
> 	* gcc.target/sparc/prefetchi-1.c: Ditto.
> ---
>   gcc/builtins.cc                               |  34 ++++--
>   gcc/config/aarch64/aarch64-sve.md             |  15 ++-
>   gcc/config/aarch64/aarch64.md                 |  19 +++-
>   gcc/config/alpha/alpha.md                     |  19 +++-
>   gcc/config/arc/arc.md                         |  20 +++-
>   gcc/config/arm/arm.md                         |  19 +++-
>   gcc/config/frv/frv.md                         |   6 +-
>   gcc/config/i386/i386.md                       |  17 ++-
>   gcc/config/ia64/ia64.md                       |  19 +++-
>   gcc/config/mips/mips.md                       |  22 +++-
>   gcc/config/pa/pa.md                           |  12 +-
>   gcc/config/rs6000/rs6000.md                   |  19 +++-
>   gcc/config/s390/s390.cc                       |  10 +-
>   gcc/config/s390/s390.md                       |  19 +++-
>   gcc/config/sh/sh.md                           |  15 ++-
>   gcc/config/sparc/sparc.md                     |  15 ++-
>   gcc/doc/rtl.texi                              |   6 +-
>   gcc/rtl.def                                   |   5 +-
>   gcc/rtlanal.cc                                |   2 +-
>   gcc/target-insns.def                          |   2 +-
>   .../execute/builtin-prefetch-1.c              |  45 ++++----
>   .../execute/builtin-prefetch-2.c              | 106 +++++++++---------
>   .../execute/builtin-prefetch-3.c              |  92 +++++++--------
>   .../execute/builtin-prefetch-4.c              |  44 ++++----
>   .../execute/builtin-prefetch-5.c              |  12 +-
>   .../execute/builtin-prefetch-6.c              |   4 +-
>   gcc/testsuite/gcc.dg/builtin-prefetch-1.c     |   5 +-
>   .../gcc.misc-tests/i386-pf-3dnow-1.c          |  16 +--
>   .../gcc.misc-tests/i386-pf-athlon-1.c         |  16 +--
>   gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c |  16 +--
>   gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  16 +--
>   .../gcc.target/aarch64/prefetchi-1.c          |  11 ++
>   gcc/testsuite/gcc.target/alpha/prefetchi-1.c  |  11 ++
>   gcc/testsuite/gcc.target/arc/prefetchi-1.c    |  11 ++
>   gcc/testsuite/gcc.target/arm/prefetchi-1.c    |  11 ++
>   gcc/testsuite/gcc.target/hppa/prefetchi-1.c   |  11 ++
>   gcc/testsuite/gcc.target/i386/avx-1.c         |   2 +-
>   gcc/testsuite/gcc.target/i386/prefetchi-1.c   |  11 ++
>   gcc/testsuite/gcc.target/i386/sse-13.c        |   2 +-
>   gcc/testsuite/gcc.target/i386/sse-23.c        |   2 +-
>   gcc/testsuite/gcc.target/ia64/prefetchi-1.c   |  11 ++
>   gcc/testsuite/gcc.target/mips/prefetchi-1.c   |  11 ++
>   .../gcc.target/powerpc/prefetchi-1.c          |  11 ++
>   gcc/testsuite/gcc.target/s390/prefetchi-1.c   |  11 ++
>   gcc/testsuite/gcc.target/sh/prefetchi-1.c     |  11 ++
>   gcc/testsuite/gcc.target/sparc/prefetchi-1.c  |  11 ++
>   46 files changed, 564 insertions(+), 241 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/alpha/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/arc/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/arm/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/hppa/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/ia64/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/mips/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/s390/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/sh/prefetchi-1.c
>   create mode 100644 gcc/testsuite/gcc.target/sparc/prefetchi-1.c
> 
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 5f319b28030..2e6d0c76beb 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -1282,18 +1282,18 @@ expand_builtin_update_setjmp_buf (rtx buf_addr)
>   static void
>   expand_builtin_prefetch (tree exp)
>   {
> -  tree arg0, arg1, arg2;
> +  tree arg0, arg1, arg2, arg3;
>     int nargs;
> -  rtx op0, op1, op2;
> +  rtx op0, op1, op2, op3;
>   
>     if (!validate_arglist (exp, POINTER_TYPE, 0))
>       return;
>   
>     arg0 = CALL_EXPR_ARG (exp, 0);
>   
> -  /* Arguments 1 and 2 are optional; argument 1 (read/write) defaults to
> -     zero (read) and argument 2 (locality) defaults to 3 (high degree of
> -     locality).  */
> +  /* Arguments 1, 2, 3 are optional; argument 1 (read/write) defaults to
> +     zero (read); argument 2 (locality) defaults to 3 (high degree of
> +     locality); argument 3 (cache type) defaults to 1 (data).  */
>     nargs = call_expr_nargs (exp);
>     if (nargs > 1)
>       arg1 = CALL_EXPR_ARG (exp, 1);
> @@ -1303,6 +1303,10 @@ expand_builtin_prefetch (tree exp)
>       arg2 = CALL_EXPR_ARG (exp, 2);
>     else
>       arg2 = integer_three_node;
> +  if (nargs > 3)
> +    arg3 = CALL_EXPR_ARG (exp, 3);
> +  else
> +    arg3 = integer_one_node;
>   
>     /* Argument 0 is an address.  */
>     op0 = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL);
> @@ -1336,14 +1340,30 @@ expand_builtin_prefetch (tree exp)
>         op2 = const0_rtx;
>       }
>   
> +  /* Argument 3 (cache type) must be a compile-time constant int.  */
> +  if (TREE_CODE (arg3) != INTEGER_CST)
> +    {
> +      error ("fourth argument to %<__builtin_prefetch%> must be a constant");
> +      arg3 = integer_one_node;
> +    }
> +  op3 = expand_normal (arg3);
> +  /* Argument 3 must be either zero or one.  */
> +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> +    {
> +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> +	" using one");
> +      op3 = const1_rtx;
> +    }
> +
>     if (targetm.have_prefetch ())
>       {
> -      class expand_operand ops[3];
> +      class expand_operand ops[4];
>   
>         create_address_operand (&ops[0], op0);
>         create_integer_operand (&ops[1], INTVAL (op1));
>         create_integer_operand (&ops[2], INTVAL (op2));
> -      if (maybe_expand_insn (targetm.code_for_prefetch, 3, ops))
> +      create_integer_operand (&ops[3], INTVAL (op3));
> +      if (maybe_expand_insn (targetm.code_for_prefetch, 4, ops))
>   	return;
>       }
>   
> diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
> index e08bee197d8..0cde862bc04 100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -1944,7 +1944,8 @@
>   		(match_operand:DI 2 "const_int_operand")]
>   	       UNSPEC_SVE_PREFETCH)
>   	     (match_operand:DI 3 "const_int_operand")
> -	     (match_operand:DI 4 "const_int_operand"))]
> +	     (match_operand:DI 4 "const_int_operand")
> +	     (const_int 1))]
>     "TARGET_SVE"
>     {
>       operands[1] = gen_rtx_MEM (<MODE>mode, operands[1]);
> @@ -1984,7 +1985,8 @@
>   		(match_operand:DI 6 "const_int_operand")]
>   	       UNSPEC_SVE_PREFETCH_GATHER)
>   	     (match_operand:DI 7 "const_int_operand")
> -	     (match_operand:DI 8 "const_int_operand"))]
> +	     (match_operand:DI 8 "const_int_operand")
> +	     (const_int 1))]
>     "TARGET_SVE"
>     {
>       static const char *const insns[][2] = {
> @@ -2013,7 +2015,8 @@
>   		(match_operand:DI 6 "const_int_operand")]
>   	       UNSPEC_SVE_PREFETCH_GATHER)
>   	     (match_operand:DI 7 "const_int_operand")
> -	     (match_operand:DI 8 "const_int_operand"))]
> +	     (match_operand:DI 8 "const_int_operand")
> +	     (const_int 1))]
>     "TARGET_SVE"
>     {
>       static const char *const insns[][2] = {
> @@ -2044,7 +2047,8 @@
>   		(match_operand:DI 6 "const_int_operand")]
>   	       UNSPEC_SVE_PREFETCH_GATHER)
>   	     (match_operand:DI 7 "const_int_operand")
> -	     (match_operand:DI 8 "const_int_operand"))]
> +	     (match_operand:DI 8 "const_int_operand")
> +	     (const_int 1))]
>     "TARGET_SVE"
>     {
>       static const char *const insns[][2] = {
> @@ -2074,7 +2078,8 @@
>   		(match_operand:DI 6 "const_int_operand")]
>   	       UNSPEC_SVE_PREFETCH_GATHER)
>   	     (match_operand:DI 7 "const_int_operand")
> -	     (match_operand:DI 8 "const_int_operand"))]
> +	     (match_operand:DI 8 "const_int_operand")
> +	     (const_int 1))]
>     "TARGET_SVE"
>     {
>       static const char *const insns[][2] = {
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index f2e3d905dbb..94fa6b4200c 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -818,10 +818,25 @@
>     [(set_attr "type" "no_insn")]
>   )
>   
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand")
> +            (match_operand:QI 1 "const_int_operand")
> +            (match_operand:QI 2 "const_int_operand")
> +	    (match_operand:QI 3 "const_int_operand"))]
> +  ""
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
> +

Both Arm and AArch64 have instruction prefetch operations, so the 
warning should say "not yet implemented", rather than "not supported".

R.

> +(define_insn "*prefetch"
>     [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand" "Dp")
>               (match_operand:QI 1 "const_int_operand" "")
> -            (match_operand:QI 2 "const_int_operand" ""))]
> +            (match_operand:QI 2 "const_int_operand" "")
> +	    (const_int 1))]
>     ""
>     {
>       const char * pftype[2][4] =
> diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
> index 87514330c22..46fd6a7b7cb 100644
> --- a/gcc/config/alpha/alpha.md
> +++ b/gcc/config/alpha/alpha.md
> @@ -5176,10 +5176,25 @@
>   ;;
>   ;; On EV6, these become official prefetch instructions.
>   
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "address_operand")
> +	     (match_operand:DI 1 "const_int_operand")
> +	     (match_operand:DI 2 "const_int_operand")
> +	     (match_operand:DI 3 "const_int_operand"))]
> +  "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>     [(prefetch (match_operand:DI 0 "address_operand" "p")
>   	     (match_operand:DI 1 "const_int_operand" "n")
> -	     (match_operand:DI 2 "const_int_operand" "n"))]
> +	     (match_operand:DI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
>   {
>     /* Interpret "no temporal locality" as this data should be evicted once
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 458d3edf716..9607a0dd572 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -5255,14 +5255,22 @@ archs4x, archs4xd"
>   (define_expand "prefetch"
>     [(prefetch (match_operand:SI 0 "address_operand" "")
>   	     (match_operand:SI 1 "const_int_operand" "")
> -	     (match_operand:SI 2 "const_int_operand" ""))]
> +	     (match_operand:SI 2 "const_int_operand" "")
> +	     (match_operand:SI 3 "const_int_operand" ""))]
>     "TARGET_HS"
> -  "")
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
>   
>   (define_insn "prefetch_1"
>     [(prefetch (match_operand:SI 0 "register_operand" "r")
>   	     (match_operand:SI 1 "const_int_operand" "n")
> -	     (match_operand:SI 2 "const_int_operand" "n"))]
> +	     (match_operand:SI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     "TARGET_HS"
>     {
>      if (INTVAL (operands[1]))
> @@ -5277,7 +5285,8 @@ archs4x, archs4xd"
>     [(prefetch (plus:SI (match_operand:SI 0 "register_operand" "r,r,r")
>   		      (match_operand:SI 1 "nonmemory_operand" "r,Cm2,Cal"))
>   	     (match_operand:SI 2 "const_int_operand" "n,n,n")
> -	     (match_operand:SI 3 "const_int_operand" "n,n,n"))]
> +	     (match_operand:SI 3 "const_int_operand" "n,n,n")
> +	     (const_int 1))]
>     "TARGET_HS"
>     {
>      if (INTVAL (operands[2]))
> @@ -5291,7 +5300,8 @@ archs4x, archs4xd"
>   (define_insn "prefetch_3"
>     [(prefetch (match_operand:SI 0 "address_operand" "p")
>   	     (match_operand:SI 1 "const_int_operand" "n")
> -	     (match_operand:SI 2 "const_int_operand" "n"))]
> +	     (match_operand:SI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     "TARGET_HS"
>     {
>      operands[0] = gen_rtx_MEM (SImode, operands[0]);
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 69bf343fb0e..7f2ec97406f 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -12206,10 +12206,25 @@
>   
>   ;; V5E instructions.
>   
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:SI 0 "address_operand")
> +	     (match_operand:SI 1 "")
> +	     (match_operand:SI 2 "")
> +	     (match_operand:SI 3 ""))]
> +  "TARGET_32BIT && arm_arch5te"
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
> +
> +(define_insn "*prefetch"
>     [(prefetch (match_operand:SI 0 "address_operand" "p")
>   	     (match_operand:SI 1 "" "")
> -	     (match_operand:SI 2 "" ""))]
> +	     (match_operand:SI 2 "" "")
> +	     (const_int 1))]
>     "TARGET_32BIT && arm_arch5te"
>     "pld\\t%a0"
>     [(set_attr "type" "load_4")]
> diff --git a/gcc/config/frv/frv.md b/gcc/config/frv/frv.md
> index 6258fe3b99e..2fb9de593c9 100644
> --- a/gcc/config/frv/frv.md
> +++ b/gcc/config/frv/frv.md
> @@ -7631,7 +7631,8 @@
>     [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
>   			UNSPEC_PREFETCH0)
>   	     (const_int 0)
> -	     (const_int 0))]
> +	     (const_int 0)
> +	     (const_int 1))]
>     ""
>     "dcpl %0, gr0, #0"
>     [(set_attr "length" "4")])
> @@ -7640,7 +7641,8 @@
>     [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
>   			UNSPEC_PREFETCH)
>   	     (const_int 0)
> -	     (const_int 0))]
> +	     (const_int 0)
> +	     (const_int 1))]
>     "TARGET_FR500_FR550_BUILTINS"
>     "nop.p\\n\\tnldub @(%0, gr0), gr0"
>     [(set_attr "length" "8")])
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 8e847520491..c65cf14b9f4 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -23635,9 +23635,15 @@
>   (define_expand "prefetch"
>     [(prefetch (match_operand 0 "address_operand")
>   	     (match_operand:SI 1 "const_int_operand")
> -	     (match_operand:SI 2 "const_int_operand"))]
> +	     (match_operand:SI 2 "const_int_operand")
> +	     (match_operand:SI 3 "const_int_operand"))]
>     "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
>   {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
>     bool write = operands[1] != const0_rtx;
>     int locality = INTVAL (operands[2]);
>   
> @@ -23679,7 +23685,8 @@
>   (define_insn "*prefetch_sse"
>     [(prefetch (match_operand 0 "address_operand" "p")
>   	     (const_int 0)
> -	     (match_operand:SI 1 "const_int_operand"))]
> +	     (match_operand:SI 1 "const_int_operand")
> +	     (const_int 1))]
>     "TARGET_PREFETCH_SSE"
>   {
>     static const char * const patterns[4] = {
> @@ -23700,7 +23707,8 @@
>   (define_insn "*prefetch_3dnow"
>     [(prefetch (match_operand 0 "address_operand" "p")
>   	     (match_operand:SI 1 "const_int_operand")
> -	     (const_int 3))]
> +	     (const_int 3)
> +	     (const_int 1))]
>     "TARGET_3DNOW || TARGET_PRFCHW || TARGET_PREFETCHWT1"
>   {
>     if (operands[1] == const0_rtx)
> @@ -23716,7 +23724,8 @@
>   (define_insn "*prefetch_prefetchwt1"
>     [(prefetch (match_operand 0 "address_operand" "p")
>   	     (const_int 1)
> -	     (const_int 2))]
> +	     (const_int 2)
> +	     (const_int 1))]
>     "TARGET_PREFETCHWT1"
>     "prefetchwt1\t%a0";
>     [(set_attr "type" "sse")
> diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
> index 5d1d47da55b..9fbbea3412a 100644
> --- a/gcc/config/ia64/ia64.md
> +++ b/gcc/config/ia64/ia64.md
> @@ -5018,10 +5018,25 @@
>     "break.f 0"
>     [(set_attr "itanium_class" "nop_f")])
>   
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "address_operand")
> +	     (match_operand:DI 1 "const_int_operand")
> +	     (match_operand:DI 2 "const_int_operand")
> +	     (match_operand:DI 3 "const_int_operand"))]
> +  ""
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>     [(prefetch (match_operand:DI 0 "address_operand" "p")
>   	     (match_operand:DI 1 "const_int_operand" "n")
> -	     (match_operand:DI 2 "const_int_operand" "n"))]
> +	     (match_operand:DI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     ""
>   {
>     static const char * const alt[2][4] = {
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index e0f0a582732..b5c547806b4 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -7227,10 +7227,25 @@
>   ;;
>   
>   
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:QI 0 "address_operand")
> +	     (match_operand 1 "const_int_operand")
> +	     (match_operand 2 "const_int_operand")
> +	     (match_operand 3 "const_int_operand"))]
> +  "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>     [(prefetch (match_operand:QI 0 "address_operand" "ZD")
>   	     (match_operand 1 "const_int_operand" "n")
> -	     (match_operand 2 "const_int_operand" "n"))]
> +	     (match_operand 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
>   {
>     if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT)
> @@ -7257,7 +7272,8 @@
>     [(prefetch (plus:P (match_operand:P 0 "register_operand" "d")
>   		     (match_operand:P 1 "register_operand" "d"))
>   	     (match_operand 2 "const_int_operand" "n")
> -	     (match_operand 3 "const_int_operand" "n"))]
> +	     (match_operand 3 "const_int_operand" "n")
> +	     (const_int 1))]
>     "ISA_HAS_PREFETCHX && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
>   {
>     if (TARGET_LOONGSON_EXT)
> diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
> index 76ae35d4cfa..a7469074c01 100644
> --- a/gcc/config/pa/pa.md
> +++ b/gcc/config/pa/pa.md
> @@ -10201,9 +10201,16 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
>   (define_expand "prefetch"
>     [(match_operand 0 "address_operand" "")
>      (match_operand 1 "const_int_operand" "")
> -   (match_operand 2 "const_int_operand" "")]
> +   (match_operand 2 "const_int_operand" "")
> +   (match_operand 3 "const_int_operand" "")]
>     "TARGET_PA_20"
>   {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +
>     operands[0] = copy_addr_to_reg (operands[0]);
>     emit_insn (gen_prefetch_20 (operands[0], operands[1], operands[2]));
>     DONE;
> @@ -10212,7 +10219,8 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
>   (define_insn "prefetch_20"
>     [(prefetch (match_operand 0 "pmode_register_operand" "r")
>   	     (match_operand:SI 1 "const_int_operand" "n")
> -	     (match_operand:SI 2 "const_int_operand" "n"))]
> +	     (match_operand:SI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     "TARGET_PA_20"
>   {
>     /* The SL cache-control completer indicates good spatial locality but
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index ad5a4cf2ef8..21ff09eca93 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -14060,10 +14060,25 @@
>     DONE;
>   })
>   
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> +	     (match_operand:SI 1 "const_int_operand")
> +	     (match_operand:SI 2 "const_int_operand")
> +	     (match_operand:SI 3 "const_int_operand"))]
> +  ""
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>     [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
>   	     (match_operand:SI 1 "const_int_operand" "n")
> -	     (match_operand:SI 2 "const_int_operand" "n"))]
> +	     (match_operand:SI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     ""
>   {
>   
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index ae309471f04..3fc5ae196b8 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -5697,13 +5697,13 @@ s390_expand_cpymem (rtx dst, rtx src, rtx len)
>   
>   	  /* Issue a read prefetch for the +3 cache line.  */
>   	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, src_addr, GEN_INT (768)),
> -				   const0_rtx, const0_rtx);
> +				   const0_rtx, const0_rtx, const1_rtx);
>   	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>   	  emit_insn (prefetch);
>   
>   	  /* Issue a write prefetch for the +3 cache line.  */
>   	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, GEN_INT (768)),
> -				   const1_rtx, const0_rtx);
> +				   const1_rtx, const0_rtx, const1_rtx);
>   	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>   	  emit_insn (prefetch);
>   	}
> @@ -5872,7 +5872,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
>   	  /* Issue a write prefetch.  */
>   	  rtx distance = GEN_INT (TARGET_SETMEM_PREFETCH_DISTANCE);
>   	  rtx prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, distance),
> -				       const1_rtx, const0_rtx);
> +				       const1_rtx, const0_rtx, const1_rtx);
>   	  emit_insn (prefetch);
>   	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>   	}
> @@ -5999,13 +5999,13 @@ s390_expand_cmpmem (rtx target, rtx op0, rtx op1, rtx len)
>   
>   	  /* Issue a read prefetch for the +2 cache line of operand 1.  */
>   	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr0, GEN_INT (512)),
> -				   const0_rtx, const0_rtx);
> +				   const0_rtx, const0_rtx, const1_rtx);
>   	  emit_insn (prefetch);
>   	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>   
>   	  /* Issue a read prefetch for the +2 cache line of operand 2.  */
>   	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr1, GEN_INT (512)),
> -				   const0_rtx, const0_rtx);
> +				   const0_rtx, const0_rtx, const1_rtx);
>   	  emit_insn (prefetch);
>   	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>   	}
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index 962927c3112..4b094aa2bcf 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -11601,10 +11601,25 @@
>   ; Data prefetch patterns
>   ;
>   
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand 0    "address_operand")
> +	     (match_operand:SI 1 "const_int_operand")
> +	     (match_operand:SI 2 "const_int_operand")
> +             (match_operand:SI 3 "const_int_operand"))]
> +  "TARGET_Z10"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>     [(prefetch (match_operand 0    "address_operand"   "ZT,X")
>   	     (match_operand:SI 1 "const_int_operand" " n,n")
> -	     (match_operand:SI 2 "const_int_operand" " n,n"))]
> +	     (match_operand:SI 2 "const_int_operand" " n,n")
> +             (const_int 1))]
>     "TARGET_Z10"
>   {
>     switch (which_alternative)
> diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
> index 59a7b216433..54a8270e80e 100644
> --- a/gcc/config/sh/sh.md
> +++ b/gcc/config/sh/sh.md
> @@ -10928,13 +10928,22 @@
>   (define_expand "prefetch"
>     [(prefetch (match_operand 0 "address_operand" "")
>   	     (match_operand:SI 1 "const_int_operand" "")
> -	     (match_operand:SI 2 "const_int_operand" ""))]
> -  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP")
> +	     (match_operand:SI 2 "const_int_operand" "")
> +	     (match_operand:SI 3 "const_int_operand" ""))]
> +  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
>   
>   (define_insn "*prefetch"
>     [(prefetch (match_operand:SI 0 "register_operand" "r")
>   	     (match_operand:SI 1 "const_int_operand" "n")
> -	     (match_operand:SI 2 "const_int_operand" "n"))]
> +	     (match_operand:SI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     "(TARGET_SH2A || TARGET_SH3) && ! TARGET_VXWORKS_RTP"
>     "pref	@%0"
>     [(set_attr "type" "other")])
> diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
> index 691e707863a..04cb6935b1b 100644
> --- a/gcc/config/sparc/sparc.md
> +++ b/gcc/config/sparc/sparc.md
> @@ -7816,9 +7816,16 @@ visl")
>   (define_expand "prefetch"
>     [(match_operand 0 "address_operand" "")
>      (match_operand 1 "const_int_operand" "")
> -   (match_operand 2 "const_int_operand" "")]
> +   (match_operand 2 "const_int_operand" "")
> +   (match_operand 3 "const_int_operand" "")]
>     "TARGET_V9"
>   {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +
>     if (TARGET_ARCH64)
>       emit_insn (gen_prefetch_64 (operands[0], operands[1], operands[2]));
>     else
> @@ -7829,7 +7836,8 @@ visl")
>   (define_insn "prefetch_64"
>     [(prefetch (match_operand:DI 0 "address_operand" "p")
>   	     (match_operand:DI 1 "const_int_operand" "n")
> -	     (match_operand:DI 2 "const_int_operand" "n"))]
> +	     (match_operand:DI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     ""
>   {
>     static const char * const prefetch_instr[2][2] = {
> @@ -7855,7 +7863,8 @@ visl")
>   (define_insn "prefetch_32"
>     [(prefetch (match_operand:SI 0 "address_operand" "p")
>   	     (match_operand:SI 1 "const_int_operand" "n")
> -	     (match_operand:SI 2 "const_int_operand" "n"))]
> +	     (match_operand:SI 2 "const_int_operand" "n")
> +	     (const_int 1))]
>     ""
>   {
>     static const char * const prefetch_instr[2][2] = {
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 43c9ee8bffe..592f4b0e4dd 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -3454,7 +3454,7 @@ position of @var{base}, @var{min} and @var{max} to the containing insn
>   and of @var{min} and @var{max} to @var{base}.  See rtl.def for details.
>   
>   @findex prefetch
> -@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality})
> +@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality} @var{cache})
>   Represents prefetch of memory at address @var{addr}.
>   Operand @var{rw} is 1 if the prefetch is for data to be written, 0 otherwise;
>   targets that do not support write prefetches should treat this as a normal
> @@ -3462,6 +3462,10 @@ prefetch.
>   Operand @var{locality} specifies the amount of temporal locality; 0 if there
>   is none or 1, 2, or 3 for increasing levels of temporal locality;
>   targets that do not support locality hints should ignore this.
> +Operand @var{cache} is 1 if the prefetch is prefetching data, 0 for prefetching
> +instruction;
> +targets that do not support instruction prefetch should treat all as data
> +prefetch.
>   
>   This insn is used to minimize cache-miss latency by moving data into a
>   cache before it is accessed.  It should use only non-faulting data prefetch
> diff --git a/gcc/rtl.def b/gcc/rtl.def
> index 08e31fa3544..f2e37d55023 100644
> --- a/gcc/rtl.def
> +++ b/gcc/rtl.def
> @@ -277,10 +277,11 @@ DEF_RTL_EXPR(ADDR_DIFF_VEC, "addr_diff_vec", "eEee0", RTX_EXTRA)
>      Operand 3 is the level of temporal locality; 0 means there is no
>      temporal locality and 1, 2, and 3 are for increasing levels of temporal
>      locality.
> +   Operand 4 is 1 for prefetch data, 0 for prefetch instrction.
>   
> -   The attributes specified by operands 2 and 3 are ignored for targets
> +   The attributes specified by operands 2, 3 and 4 are ignored for targets
>      whose prefetch instructions do not support them.  */
> -DEF_RTL_EXPR(PREFETCH, "prefetch", "eee", RTX_EXTRA)
> +DEF_RTL_EXPR(PREFETCH, "prefetch", "eeee", RTX_EXTRA)
>   
>   /* ----------------------------------------------------------------------
>      At the top level of an instruction (perhaps under PARALLEL).
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index 56da7435a28..7eeef285f1e 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -6196,7 +6196,7 @@ setup_reg_subrtx_bounds (unsigned int code)
>     while (format[i] == 'e');
>     rtx_all_subrtx_bounds[code].count = i - rtx_all_subrtx_bounds[code].start;
>     /* rtl-iter.h relies on this.  */
> -  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 3);
> +  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 4);
>   
>     for (; format[i]; ++i)
>       if (format[i] == 'E' || format[i] == 'V' || format[i] == 'e')
> diff --git a/gcc/target-insns.def b/gcc/target-insns.def
> index de8c0092f98..ca13d1c4393 100644
> --- a/gcc/target-insns.def
> +++ b/gcc/target-insns.def
> @@ -76,7 +76,7 @@ DEF_TARGET_INSN (omp_simt_ordered, (rtx x0, rtx x1))
>   DEF_TARGET_INSN (omp_simt_vote_any, (rtx x0, rtx x1))
>   DEF_TARGET_INSN (omp_simt_xchg_bfly, (rtx x0, rtx x1, rtx x2))
>   DEF_TARGET_INSN (omp_simt_xchg_idx, (rtx x0, rtx x1, rtx x2))
> -DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2))
> +DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2, rtx x3))
>   DEF_TARGET_INSN (probe_stack, (rtx x0))
>   DEF_TARGET_INSN (probe_stack_address, (rtx x0))
>   DEF_TARGET_INSN (prologue, (void))
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> index 4ee05a94d9f..ccc5fab15e5 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> @@ -1,57 +1,62 @@
>   /* Test that __builtin_prefetch does no harm.
>   
> -   Prefetch using all valid combinations of rw and locality values.
> +   Prefetch using all valid combinations of cache, rw and locality values.
>      These must be compile-time constants.  */
>   
>   #define NO_TEMPORAL_LOCALITY 0
>   #define LOW_TEMPORAL_LOCALITY 1
> -#define MODERATE_TEMPORAL_LOCALITY 1
> +#define MODERATE_TEMPORAL_LOCALITY 2
>   #define HIGH_TEMPORAL_LOCALITY 3
>   
>   #define WRITE_ACCESS 1
>   #define READ_ACCESS 0
>   
> +#define DATA_PRFCH 1
> +#define INST_PRFCH 0
> +
>   enum locality { none, low, moderate, high };
>   enum rw { read, write };
> +enum cache { inst, data };
>   
>   int arr[10];
>   
>   void
>   good_const (const int *p)
>   {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, READ_ACCESS, 3);
> -  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, READ_ACCESS, 3, 1);
> +  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY, DATA_PRFCH);
>   }
>   
>   void
>   good_enum (const int *p)
>   {
> -    __builtin_prefetch (p, read, none);
> -    __builtin_prefetch (p, read, low);
> -    __builtin_prefetch (p, read, moderate);
> -    __builtin_prefetch (p, read, high);
> -    __builtin_prefetch (p, write, none);
> -    __builtin_prefetch (p, write, low);
> -    __builtin_prefetch (p, write, moderate);
> -    __builtin_prefetch (p, write, high);
> +    __builtin_prefetch (p, read, none, data);
> +    __builtin_prefetch (p, read, low, data);
> +    __builtin_prefetch (p, read, moderate, data);
> +    __builtin_prefetch (p, read, high, data);
> +    __builtin_prefetch (p, write, none, data);
> +    __builtin_prefetch (p, write, low, data);
> +    __builtin_prefetch (p, write, moderate, data);
> +    __builtin_prefetch (p, write, high, data);
>   }
>   
>   void
>   good_expr (const int *p)
>   {
> -  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3));
> -  __builtin_prefetch (p, 1 + 0, 1 + 2);
> +  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3), 1 + 0);
> +  __builtin_prefetch (p, 1 + 0, 1 + 2, 0 + 1);
>   }
>   
>   void
>   good_vararg (const int *p)
>   {
> +  __builtin_prefetch (p, 0, 3, 1);
>     __builtin_prefetch (p, 0, 3);
>     __builtin_prefetch (p, 0);
>     __builtin_prefetch (p, 1);
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> index 530a1b0ef0d..6aff1f281e0 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> @@ -26,9 +26,9 @@ struct S *ptr_str = &str;
>   void
>   simple_global ()
>   {
> -  __builtin_prefetch (glob_int_arr, 0, 0);
> -  __builtin_prefetch (glob_ptr_int, 0, 0);
> -  __builtin_prefetch (&glob_int, 0, 0);
> +  __builtin_prefetch (glob_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_int, 0, 0, 1);
>   }
>   
>   /* Prefetch file-level static variables using the address of the variable.  */
> @@ -36,9 +36,9 @@ simple_global ()
>   void
>   simple_file ()
>   {
> -  __builtin_prefetch (stat_int_arr, 0, 0);
> -  __builtin_prefetch (stat_ptr_int, 0, 0);
> -  __builtin_prefetch (&stat_int, 0, 0);
> +  __builtin_prefetch (stat_int_arr, 0, 0, 1);
> +  __builtin_prefetch (stat_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&stat_int, 0, 0, 1);
>   }
>   
>   /* Prefetch local static variables using the address of the variable.  */
> @@ -49,9 +49,9 @@ simple_static_local ()
>     static int gx[100];
>     static int *hx = gx;
>     static int ix;
> -  __builtin_prefetch (gx, 0, 0);
> -  __builtin_prefetch (hx, 0, 0);
> -  __builtin_prefetch (&ix, 0, 0);
> +  __builtin_prefetch (gx, 0, 0, 1);
> +  __builtin_prefetch (hx, 0, 0, 1);
> +  __builtin_prefetch (&ix, 0, 0, 1);
>   }
>   
>   /* Prefetch local stack variables using the address of the variable.  */
> @@ -62,9 +62,9 @@ simple_local ()
>     int gx[100];
>     int *hx = gx;
>     int ix;
> -  __builtin_prefetch (gx, 0, 0);
> -  __builtin_prefetch (hx, 0, 0);
> -  __builtin_prefetch (&ix, 0, 0);
> +  __builtin_prefetch (gx, 0, 0, 1);
> +  __builtin_prefetch (hx, 0, 0, 1);
> +  __builtin_prefetch (&ix, 0, 0, 1);
>   }
>   
>   /* Prefetch arguments using the address of the variable.  */
> @@ -72,9 +72,9 @@ simple_local ()
>   void
>   simple_arg (int g[100], int *h, int i)
>   {
> -  __builtin_prefetch (g, 0, 0);
> -  __builtin_prefetch (h, 0, 0);
> -  __builtin_prefetch (&i, 0, 0);
> +  __builtin_prefetch (g, 0, 0, 1);
> +  __builtin_prefetch (h, 0, 0, 1);
> +  __builtin_prefetch (&i, 0, 0, 1);
>   }
>   
>   /* Prefetch using address expressions involving global variables.  */
> @@ -82,25 +82,25 @@ simple_arg (int g[100], int *h, int i)
>   void
>   expr_global (void)
>   {
> -  __builtin_prefetch (&str, 0, 0);
> -  __builtin_prefetch (ptr_str, 0, 0);
> -  __builtin_prefetch (&str.b, 0, 0);
> -  __builtin_prefetch (&ptr_str->b, 0, 0);
> -  __builtin_prefetch (&str.d, 0, 0);
> -  __builtin_prefetch (&ptr_str->d, 0, 0);
> -  __builtin_prefetch (str.next, 0, 0);
> -  __builtin_prefetch (ptr_str->next, 0, 0);
> -  __builtin_prefetch (str.next->d, 0, 0);
> -  __builtin_prefetch (ptr_str->next->d, 0, 0);
> -
> -  __builtin_prefetch (&glob_int_arr, 0, 0);
> -  __builtin_prefetch (glob_ptr_int, 0, 0);
> -  __builtin_prefetch (&glob_int_arr[2], 0, 0);
> -  __builtin_prefetch (&glob_ptr_int[3], 0, 0);
> -  __builtin_prefetch (glob_int_arr+3, 0, 0);
> -  __builtin_prefetch (glob_int_arr+glob_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_int+5, 0, 0);
> -  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0);
> +  __builtin_prefetch (&str, 0, 0, 1);
> +  __builtin_prefetch (ptr_str, 0, 0, 1);
> +  __builtin_prefetch (&str.b, 0, 0, 1);
> +  __builtin_prefetch (&ptr_str->b, 0, 0, 1);
> +  __builtin_prefetch (&str.d, 0, 0, 1);
> +  __builtin_prefetch (&ptr_str->d, 0, 0, 1);
> +  __builtin_prefetch (str.next, 0, 0, 1);
> +  __builtin_prefetch (ptr_str->next, 0, 0, 1);
> +  __builtin_prefetch (str.next->d, 0, 0, 1);
> +  __builtin_prefetch (ptr_str->next->d, 0, 0, 1);
> +
> +  __builtin_prefetch (&glob_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_int_arr[2], 0, 0, 1);
> +  __builtin_prefetch (&glob_ptr_int[3], 0, 0, 1);
> +  __builtin_prefetch (glob_int_arr+3, 0, 0, 1);
> +  __builtin_prefetch (glob_int_arr+glob_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0, 1);
>   }
>   
>   /* Prefetch using address expressions involving local variables.  */
> @@ -114,25 +114,25 @@ expr_local (void)
>     struct S *pt = &t;
>     int j = 4;
>   
> -  __builtin_prefetch (&t, 0, 0);
> -  __builtin_prefetch (pt, 0, 0);
> -  __builtin_prefetch (&t.b, 0, 0);
> -  __builtin_prefetch (&pt->b, 0, 0);
> -  __builtin_prefetch (&t.d, 0, 0);
> -  __builtin_prefetch (&pt->d, 0, 0);
> -  __builtin_prefetch (t.next, 0, 0);
> -  __builtin_prefetch (pt->next, 0, 0);
> -  __builtin_prefetch (t.next->d, 0, 0);
> -  __builtin_prefetch (pt->next->d, 0, 0);
> -
> -  __builtin_prefetch (&b, 0, 0);
> -  __builtin_prefetch (pb, 0, 0);
> -  __builtin_prefetch (&b[2], 0, 0);
> -  __builtin_prefetch (&pb[3], 0, 0);
> -  __builtin_prefetch (b+3, 0, 0);
> -  __builtin_prefetch (b+j, 0, 0);
> -  __builtin_prefetch (pb+5, 0, 0);
> -  __builtin_prefetch (pb+j, 0, 0);
> +  __builtin_prefetch (&t, 0, 0, 1);
> +  __builtin_prefetch (pt, 0, 0, 1);
> +  __builtin_prefetch (&t.b, 0, 0, 1);
> +  __builtin_prefetch (&pt->b, 0, 0, 1);
> +  __builtin_prefetch (&t.d, 0, 0, 1);
> +  __builtin_prefetch (&pt->d, 0, 0, 1);
> +  __builtin_prefetch (t.next, 0, 0, 1);
> +  __builtin_prefetch (pt->next, 0, 0, 1);
> +  __builtin_prefetch (t.next->d, 0, 0, 1);
> +  __builtin_prefetch (pt->next->d, 0, 0, 1);
> +
> +  __builtin_prefetch (&b, 0, 0, 1);
> +  __builtin_prefetch (pb, 0, 0, 1);
> +  __builtin_prefetch (&b[2], 0, 0, 1);
> +  __builtin_prefetch (&pb[3], 0, 0, 1);
> +  __builtin_prefetch (b+3, 0, 0, 1);
> +  __builtin_prefetch (b+j, 0, 0, 1);
> +  __builtin_prefetch (pb+5, 0, 0, 1);
> +  __builtin_prefetch (pb+j, 0, 0, 1);
>   }
>   
>   int
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> index 2e2e808c172..38ce410384a 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> @@ -36,11 +36,11 @@ volatile struct S * volatile vol_ptr_vol_str = &vol_str;
>   void
>   simple_vol_global ()
>   {
> -  __builtin_prefetch (glob_vol_int_arr, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&glob_vol_int, 0, 0);
> +  __builtin_prefetch (glob_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_int, 0, 0, 1);
>   }
>   
>   /* Prefetch volatile static variables using the address of the variable.  */
> @@ -48,11 +48,11 @@ simple_vol_global ()
>   void
>   simple_vol_file ()
>   {
> -  __builtin_prefetch (stat_vol_int_arr, 0, 0);
> -  __builtin_prefetch (stat_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (stat_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&stat_vol_int, 0, 0);
> +  __builtin_prefetch (stat_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (stat_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (stat_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&stat_vol_int, 0, 0, 1);
>   }
>   
>   /* Prefetch using address expressions involving volatile global variables.  */
> @@ -60,43 +60,43 @@ simple_vol_file ()
>   void
>   expr_vol_global (void)
>   {
> -  __builtin_prefetch (&vol_str, 0, 0);
> -  __builtin_prefetch (ptr_vol_str, 0, 0);
> -  __builtin_prefetch (vol_ptr_str, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str, 0, 0);
> -  __builtin_prefetch (&vol_str.b, 0, 0);
> -  __builtin_prefetch (&ptr_vol_str->b, 0, 0);
> -  __builtin_prefetch (&vol_ptr_str->b, 0, 0);
> -  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0);
> -  __builtin_prefetch (&vol_str.d, 0, 0);
> -  __builtin_prefetch (&vol_ptr_str->d, 0, 0);
> -  __builtin_prefetch (&ptr_vol_str->d, 0, 0);
> -  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0);
> -  __builtin_prefetch (vol_str.next, 0, 0);
> -  __builtin_prefetch (vol_ptr_str->next, 0, 0);
> -  __builtin_prefetch (ptr_vol_str->next, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0);
> -  __builtin_prefetch (vol_str.next->d, 0, 0);
> -  __builtin_prefetch (vol_ptr_str->next->d, 0, 0);
> -  __builtin_prefetch (ptr_vol_str->next->d, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0);
> +  __builtin_prefetch (&vol_str, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str, 0, 0, 1);
> +  __builtin_prefetch (&vol_str.b, 0, 0, 1);
> +  __builtin_prefetch (&ptr_vol_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_str.d, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_str->d, 0, 0, 1);
> +  __builtin_prefetch (&ptr_vol_str->d, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0, 1);
> +  __builtin_prefetch (vol_str.next, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str->next, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str->next, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0, 1);
> +  __builtin_prefetch (vol_str.next->d, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str->next->d, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str->next->d, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0, 1);
>   
> -  __builtin_prefetch (&glob_vol_int_arr, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0);
> -  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0);
> -  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0);
> -  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0);
> -  __builtin_prefetch (glob_vol_int_arr+3, 0, 0);
> -  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0);
> +  __builtin_prefetch (&glob_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0, 1);
> +  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0, 1);
> +  __builtin_prefetch (glob_vol_int_arr+3, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0, 1);
>   }
>   
>   int
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> index ade892b21a7..69b4cbe1854 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> @@ -17,7 +17,7 @@ int
>   assign_arg_ptr (int *p)
>   {
>     int *q;
> -  __builtin_prefetch ((q = p), 0, 0);
> +  __builtin_prefetch ((q = p), 0, 0, 1);
>     return q == p;
>   }
>   
> @@ -25,7 +25,7 @@ int
>   assign_glob_ptr (void)
>   {
>     int *q;
> -  __builtin_prefetch ((q = ptr), 0, 0);
> +  __builtin_prefetch ((q = ptr), 0, 0, 1);
>     return q == ptr;
>   }
>   
> @@ -33,7 +33,7 @@ int
>   assign_arg_idx (int *p, int i)
>   {
>     int j;
> -  __builtin_prefetch (&p[j = i], 0, 0);
> +  __builtin_prefetch (&p[j = i], 0, 0, 1);
>     return j == i;
>   }
>   
> @@ -41,7 +41,7 @@ int
>   assign_glob_idx (void)
>   {
>     int j;
> -  __builtin_prefetch (&ptr[j = arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[j = arrindex], 0, 0, 1);
>     return j == arrindex;
>   }
>   
> @@ -53,7 +53,7 @@ preinc_arg_ptr (int *p)
>   {
>     int *q;
>     q = p + 1;
> -  __builtin_prefetch (++p, 0, 0);
> +  __builtin_prefetch (++p, 0, 0, 1);
>     return p == q;
>   }
>   
> @@ -62,7 +62,7 @@ preinc_glob_ptr (void)
>   {
>     int *q;
>     q = ptr + 1;
> -  __builtin_prefetch (++ptr, 0, 0);
> +  __builtin_prefetch (++ptr, 0, 0, 1);
>     return ptr == q;
>   }
>   
> @@ -71,7 +71,7 @@ postinc_arg_ptr (int *p)
>   {
>     int *q;
>     q = p + 1;
> -  __builtin_prefetch (p++, 0, 0);
> +  __builtin_prefetch (p++, 0, 0, 1);
>     return p == q;
>   }
>   
> @@ -80,7 +80,7 @@ postinc_glob_ptr (void)
>   {
>     int *q;
>     q = ptr + 1;
> -  __builtin_prefetch (ptr++, 0, 0);
> +  __builtin_prefetch (ptr++, 0, 0, 1);
>     return ptr == q;
>   }
>   
> @@ -89,7 +89,7 @@ predec_arg_ptr (int *p)
>   {
>     int *q;
>     q = p - 1;
> -  __builtin_prefetch (--p, 0, 0);
> +  __builtin_prefetch (--p, 0, 0, 1);
>     return p == q;
>   }
>   
> @@ -98,7 +98,7 @@ predec_glob_ptr (void)
>   {
>     int *q;
>     q = ptr - 1;
> -  __builtin_prefetch (--ptr, 0, 0);
> +  __builtin_prefetch (--ptr, 0, 0, 1);
>     return ptr == q;
>   }
>   
> @@ -107,7 +107,7 @@ postdec_arg_ptr (int *p)
>   {
>     int *q;
>     q = p - 1;
> -  __builtin_prefetch (p--, 0, 0);
> +  __builtin_prefetch (p--, 0, 0, 1);
>     return p == q;
>   }
>   
> @@ -116,7 +116,7 @@ postdec_glob_ptr (void)
>   {
>     int *q;
>     q = ptr - 1;
> -  __builtin_prefetch (ptr--, 0, 0);
> +  __builtin_prefetch (ptr--, 0, 0, 1);
>     return ptr == q;
>   }
>   
> @@ -124,7 +124,7 @@ int
>   preinc_arg_idx (int *p, int i)
>   {
>     int j = i + 1;
> -  __builtin_prefetch (&p[++i], 0, 0);
> +  __builtin_prefetch (&p[++i], 0, 0, 1);
>     return i == j;
>   }
>   
> @@ -133,7 +133,7 @@ int
>   preinc_glob_idx (void)
>   {
>     int j = arrindex + 1;
> -  __builtin_prefetch (&ptr[++arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[++arrindex], 0, 0, 1);
>     return arrindex == j;
>   }
>   
> @@ -141,7 +141,7 @@ int
>   postinc_arg_idx (int *p, int i)
>   {
>     int j = i + 1;
> -  __builtin_prefetch (&p[i++], 0, 0);
> +  __builtin_prefetch (&p[i++], 0, 0, 1);
>     return i == j;
>   }
>   
> @@ -149,7 +149,7 @@ int
>   postinc_glob_idx (void)
>   {
>     int j = arrindex + 1;
> -  __builtin_prefetch (&ptr[arrindex++], 0, 0);
> +  __builtin_prefetch (&ptr[arrindex++], 0, 0, 1);
>     return arrindex == j;
>   }
>   
> @@ -157,7 +157,7 @@ int
>   predec_arg_idx (int *p, int i)
>   {
>     int j = i - 1;
> -  __builtin_prefetch (&p[--i], 0, 0);
> +  __builtin_prefetch (&p[--i], 0, 0, 1);
>     return i == j;
>   }
>   
> @@ -165,7 +165,7 @@ int
>   predec_glob_idx (void)
>   {
>     int j = arrindex - 1;
> -  __builtin_prefetch (&ptr[--arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[--arrindex], 0, 0, 1);
>     return arrindex == j;
>   }
>   
> @@ -173,7 +173,7 @@ int
>   postdec_arg_idx (int *p, int i)
>   {
>     int j = i - 1;
> -  __builtin_prefetch (&p[i--], 0, 0);
> +  __builtin_prefetch (&p[i--], 0, 0, 1);
>     return i == j;
>   }
>   
> @@ -181,7 +181,7 @@ int
>   postdec_glob_idx (void)
>   {
>     int j = arrindex - 1;
> -  __builtin_prefetch (&ptr[arrindex--], 0, 0);
> +  __builtin_prefetch (&ptr[arrindex--], 0, 0, 1);
>     return arrindex == j;
>   }
>   
> @@ -200,7 +200,7 @@ getptr (int *p)
>   int
>   funccall_arg_ptr (int *p)
>   {
> -  __builtin_prefetch (getptr (p), 0, 0);
> +  __builtin_prefetch (getptr (p), 0, 0, 1);
>     return getptrcnt == 1;
>   }
>   
> @@ -216,7 +216,7 @@ getint (int i)
>   int
>   funccall_arg_idx (int *p, int i)
>   {
> -  __builtin_prefetch (&p[getint (i)], 0, 0);
> +  __builtin_prefetch (&p[getint (i)], 0, 0, 1);
>     return getintcnt == 1;
>   }
>   
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> index f42a2c0ca87..a6fa1741888 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> @@ -18,32 +18,32 @@ int idx = 3;
>   void
>   arg_ptr (char *p)
>   {
> -  __builtin_prefetch (p, 0, 0);
> +  __builtin_prefetch (p, 0, 0, 1);
>   }
>   
>   void
>   arg_idx (char *p, int i)
>   {
> -  __builtin_prefetch (&p[i], 0, 0);
> +  __builtin_prefetch (&p[i], 0, 0, 1);
>   }
>   
>   void
>   glob_ptr (void)
>   {
> -  __builtin_prefetch (ptr, 0, 0);
> +  __builtin_prefetch (ptr, 0, 0, 1);
>   }
>   
>   void
>   glob_idx (void)
>   {
> -  __builtin_prefetch (&ptr[idx], 0, 0);
> +  __builtin_prefetch (&ptr[idx], 0, 0, 1);
>   }
>   
>   int
>   main ()
>   {
> -  __builtin_prefetch (&s.b, 0, 0);
> -  __builtin_prefetch (&s.c[1], 0, 0);
> +  __builtin_prefetch (&s.b, 0, 0, 1);
> +  __builtin_prefetch (&s.c[1], 0, 0, 1);
>   
>     arg_ptr (&s.c[1]);
>     arg_ptr (ptr+3);
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> index f643c5c7286..fabecaf56dc 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> @@ -25,7 +25,7 @@ prefetch_for_read (void)
>   {
>     int i;
>     for (i = 0; i < ARRSIZE; i++)
> -    __builtin_prefetch (bad_addr[i], 0, 0);
> +    __builtin_prefetch (bad_addr[i], 0, 0, 1);
>   }
>   
>   void
> @@ -33,7 +33,7 @@ prefetch_for_write (void)
>   {
>     int i;
>     for (i = 0; i < ARRSIZE; i++)
> -    __builtin_prefetch (bad_addr[i], 1, 0);
> +    __builtin_prefetch (bad_addr[i], 1, 0, 1);
>   }
>   
>   int
> diff --git a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> index 11beb4e1bbe..84d564dc72c 100644
> --- a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> +++ b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> @@ -1,6 +1,6 @@
>   /* Test that __builtin_prefetch does no harm.
>   
> -   Prefetch using some invalid rw and locality values.  These must be
> +   Prefetch using some invalid cache, rw and locality values.  These must be
>      compile-time constants.  */
>   
>   /* { dg-do run } */
> @@ -9,6 +9,7 @@ extern void exit (int);
>   
>   enum locality { none, low, moderate, high, bogus };
>   enum rw { read, write };
> +enum cache { inst, data };
>   
>   int arr[10];
>   
> @@ -34,6 +35,8 @@ bad (int *p)
>     __builtin_prefetch (p, 0, -1);  /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
>     __builtin_prefetch (p, 0, 4);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
>     __builtin_prefetch (p, 0, bogus);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
> +  __builtin_prefetch (p, 0, 3, -1);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
> +  __builtin_prefetch (p, 0, 3, bogus);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
>   }
>   
>   int
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> index 638749a5a68..eb9197b357c 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>   
>   void foo (char *p)
>   {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>   }
>   
>   int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> index d793437f175..b5081815f7a 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> @@ -10,14 +10,14 @@ char *msg = "howdy there";
>   
>   void foo (char *p)
>   {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>   }
>   
>   int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> index 04e814d5a9c..2317f665107 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>   
>   void foo (char *p)
>   {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>   }
>   
>   int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> index 3707c7074be..936ad9e79ad 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>   
>   void foo (char *p)
>   {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>   }
>   
>   int main ()
> diff --git a/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/alpha/prefetchi-1.c b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
> new file mode 100644
> index 00000000000..5d9c387e260
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=ev6" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/arc/prefetchi-1.c b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
> new file mode 100644
> index 00000000000..7e023ab6498
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=archs" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/arm/prefetchi-1.c b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
> new file mode 100644
> index 00000000000..0fbcb7019bc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-O2 -march=armv5te" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/hppa/prefetchi-1.c b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
> new file mode 100644
> index 00000000000..26854a6828d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mpa-risc-2-0" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
> index 051a1b59b5b..ea0b9f6bcef 100644
> --- a/gcc/testsuite/gcc.target/i386/avx-1.c
> +++ b/gcc/testsuite/gcc.target/i386/avx-1.c
> @@ -153,7 +153,7 @@
>   #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>   
>   /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>   #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>   #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>     __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-1.c b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> new file mode 100644
> index 00000000000..b32d59f2e5f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad(const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
> index ca662f7bd47..6c9742cf494 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-13.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-13.c
> @@ -125,7 +125,7 @@
>   #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>   
>   /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>   #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>   #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>     __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
> index ba1310f9f89..344913e9a90 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-23.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-23.c
> @@ -94,7 +94,7 @@
>   #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>   
>   /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>   #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>   #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>     __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/ia64/prefetchi-1.c b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/mips/prefetchi-1.c b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
> new file mode 100644
> index 00000000000..23e78a0c7ba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mips4 -mexplicit-relocs" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/s390/prefetchi-1.c b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
> new file mode 100644
> index 00000000000..5ef557f1d8c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mzarch -march=z10" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/sh/prefetchi-1.c b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
> new file mode 100644
> index 00000000000..347bdea8df8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { has_pref } } } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/sparc/prefetchi-1.c b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
> new file mode 100644
> index 00000000000..1bd7ad495e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=v9" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-17 15:28   ` Richard Earnshaw
@ 2022-10-18  9:20     ` Haochen Jiang
  0 siblings, 0 replies; 26+ messages in thread
From: Haochen Jiang @ 2022-10-18  9:20 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard.Earnshaw

Hi Richard,

This is my new patch and changes the warning message on aarch64/arm.

Ok for trunk?

BRs,
Haochen

gcc/ChangeLog:

	* builtins.cc (expand_builtin_prefetch): Handle the fourth parameter in
	expand function.
	* config/aarch64/aarch64-sve.md: Add default parameter value.
	* config/aarch64/aarch64.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/alpha/alpha.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/arc/arc.md: Add default parameter value.
	* config/arm/arm.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/frv/frv.md: Ditto.
	* config/i386/i386.md: Ditto.
	* config/ia64/ia64.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/mips/mips.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/pa/pa.md: Ditto.
	* config/rs6000/rs6000.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for
	gen_prefetch call.
	(s390_expand_setmem): Ditto.
	(s390_expand_cmpmem): Ditto.
	* config/s390/s390.md (prefetch): New define_expand.
	(*prefetch): Add default parameter value.
	* config/sh/sh.md: Ditto.
	* config/sparc/sparc.md: Ditto.
	* doc/rtl.texi: Document cache variable for prefetch.
	* rtl.def (PREFETCH): Change prefetch DEF_RTL_EXPR to add fourth parameter.
	* rtlanal.cc (setup_reg_subrtx_bounds): Change gcc_checking_assert for
	fourth parameter.
	* target-insns.def (prefetch): Add fourth rtx for prefetch.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/builtin-prefetch-1.c: Add fourth parameter for
	testcases.
	* gcc.c-torture/execute/builtin-prefetch-2.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-3.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-4.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-5.c: Ditto.
	* gcc.c-torture/execute/builtin-prefetch-6.c: Ditto.
	* gcc.dg/builtin-prefetch-1.c: Ditto.
	* gcc.misc-tests/i386-pf-3dnow-1.c: Ditto.
	* gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
	* gcc.misc-tests/i386-pf-none-1.c: Ditto.
	* gcc.misc-tests/i386-pf-sse-1.c: Ditto.
	* gcc.target/i386/avx-1.c: Change prefetch macro define to variable args.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/aarch64/prefetchi-1.c: New test.
	* gcc.target/alpha/prefetchi-1.c: Ditto.
	* gcc.target/arc/prefetchi-1.c: Ditto.
	* gcc.target/arm/prefetchi-1.c: Ditto.
	* gcc.target/hppa/prefetchi-1.c: Ditto.
	* gcc.target/i386/prefetchi-1.c: Ditto.
	* gcc.target/ia64/prefetchi-1.c: Ditto.
	* gcc.target/mips/prefetchi-1.c: Ditto.
	* gcc.target/powerpc/prefetchi-1.c: Ditto.
	* gcc.target/s390/prefetchi-1.c: Ditto.
	* gcc.target/sh/prefetchi-1.c: Ditto.
	* gcc.target/sparc/prefetchi-1.c: Ditto.
---
 gcc/builtins.cc                               |  34 ++++--
 gcc/config/aarch64/aarch64-sve.md             |  15 ++-
 gcc/config/aarch64/aarch64.md                 |  19 +++-
 gcc/config/alpha/alpha.md                     |  19 +++-
 gcc/config/arc/arc.md                         |  20 +++-
 gcc/config/arm/arm.md                         |  19 +++-
 gcc/config/frv/frv.md                         |   6 +-
 gcc/config/i386/i386.md                       |  17 ++-
 gcc/config/ia64/ia64.md                       |  19 +++-
 gcc/config/mips/mips.md                       |  22 +++-
 gcc/config/pa/pa.md                           |  12 +-
 gcc/config/rs6000/rs6000.md                   |  19 +++-
 gcc/config/s390/s390.cc                       |  10 +-
 gcc/config/s390/s390.md                       |  19 +++-
 gcc/config/sh/sh.md                           |  15 ++-
 gcc/config/sparc/sparc.md                     |  15 ++-
 gcc/doc/rtl.texi                              |   6 +-
 gcc/rtl.def                                   |   5 +-
 gcc/rtlanal.cc                                |   2 +-
 gcc/target-insns.def                          |   2 +-
 .../execute/builtin-prefetch-1.c              |  45 ++++----
 .../execute/builtin-prefetch-2.c              | 106 +++++++++---------
 .../execute/builtin-prefetch-3.c              |  92 +++++++--------
 .../execute/builtin-prefetch-4.c              |  44 ++++----
 .../execute/builtin-prefetch-5.c              |  12 +-
 .../execute/builtin-prefetch-6.c              |   4 +-
 gcc/testsuite/gcc.dg/builtin-prefetch-1.c     |   5 +-
 .../gcc.misc-tests/i386-pf-3dnow-1.c          |  16 +--
 .../gcc.misc-tests/i386-pf-athlon-1.c         |  16 +--
 gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c |  16 +--
 gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  16 +--
 .../gcc.target/aarch64/prefetchi-1.c          |  11 ++
 gcc/testsuite/gcc.target/alpha/prefetchi-1.c  |  11 ++
 gcc/testsuite/gcc.target/arc/prefetchi-1.c    |  11 ++
 gcc/testsuite/gcc.target/arm/prefetchi-1.c    |  11 ++
 gcc/testsuite/gcc.target/hppa/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/i386/avx-1.c         |   2 +-
 gcc/testsuite/gcc.target/i386/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/i386/sse-13.c        |   2 +-
 gcc/testsuite/gcc.target/i386/sse-23.c        |   2 +-
 gcc/testsuite/gcc.target/ia64/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/mips/prefetchi-1.c   |  11 ++
 .../gcc.target/powerpc/prefetchi-1.c          |  11 ++
 gcc/testsuite/gcc.target/s390/prefetchi-1.c   |  11 ++
 gcc/testsuite/gcc.target/sh/prefetchi-1.c     |  11 ++
 gcc/testsuite/gcc.target/sparc/prefetchi-1.c  |  11 ++
 46 files changed, 564 insertions(+), 241 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/alpha/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/arc/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/hppa/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/ia64/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/mips/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/sh/prefetchi-1.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/prefetchi-1.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 5f319b28030..2e6d0c76beb 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -1282,18 +1282,18 @@ expand_builtin_update_setjmp_buf (rtx buf_addr)
 static void
 expand_builtin_prefetch (tree exp)
 {
-  tree arg0, arg1, arg2;
+  tree arg0, arg1, arg2, arg3;
   int nargs;
-  rtx op0, op1, op2;
+  rtx op0, op1, op2, op3;
 
   if (!validate_arglist (exp, POINTER_TYPE, 0))
     return;
 
   arg0 = CALL_EXPR_ARG (exp, 0);
 
-  /* Arguments 1 and 2 are optional; argument 1 (read/write) defaults to
-     zero (read) and argument 2 (locality) defaults to 3 (high degree of
-     locality).  */
+  /* Arguments 1, 2, 3 are optional; argument 1 (read/write) defaults to
+     zero (read); argument 2 (locality) defaults to 3 (high degree of
+     locality); argument 3 (cache type) defaults to 1 (data).  */
   nargs = call_expr_nargs (exp);
   if (nargs > 1)
     arg1 = CALL_EXPR_ARG (exp, 1);
@@ -1303,6 +1303,10 @@ expand_builtin_prefetch (tree exp)
     arg2 = CALL_EXPR_ARG (exp, 2);
   else
     arg2 = integer_three_node;
+  if (nargs > 3)
+    arg3 = CALL_EXPR_ARG (exp, 3);
+  else
+    arg3 = integer_one_node;
 
   /* Argument 0 is an address.  */
   op0 = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL);
@@ -1336,14 +1340,30 @@ expand_builtin_prefetch (tree exp)
       op2 = const0_rtx;
     }
 
+  /* Argument 3 (cache type) must be a compile-time constant int.  */
+  if (TREE_CODE (arg3) != INTEGER_CST)
+    {
+      error ("fourth argument to %<__builtin_prefetch%> must be a constant");
+      arg3 = integer_one_node;
+    }
+  op3 = expand_normal (arg3);
+  /* Argument 3 must be either zero or one.  */
+  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
+    {
+      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
+	" using one");
+      op3 = const1_rtx;
+    }
+
   if (targetm.have_prefetch ())
     {
-      class expand_operand ops[3];
+      class expand_operand ops[4];
 
       create_address_operand (&ops[0], op0);
       create_integer_operand (&ops[1], INTVAL (op1));
       create_integer_operand (&ops[2], INTVAL (op2));
-      if (maybe_expand_insn (targetm.code_for_prefetch, 3, ops))
+      create_integer_operand (&ops[3], INTVAL (op3));
+      if (maybe_expand_insn (targetm.code_for_prefetch, 4, ops))
 	return;
     }
 
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index e08bee197d8..0cde862bc04 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -1944,7 +1944,8 @@
 		(match_operand:DI 2 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH)
 	     (match_operand:DI 3 "const_int_operand")
-	     (match_operand:DI 4 "const_int_operand"))]
+	     (match_operand:DI 4 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     operands[1] = gen_rtx_MEM (<MODE>mode, operands[1]);
@@ -1984,7 +1985,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
@@ -2013,7 +2015,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
@@ -2044,7 +2047,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
@@ -2074,7 +2078,8 @@
 		(match_operand:DI 6 "const_int_operand")]
 	       UNSPEC_SVE_PREFETCH_GATHER)
 	     (match_operand:DI 7 "const_int_operand")
-	     (match_operand:DI 8 "const_int_operand"))]
+	     (match_operand:DI 8 "const_int_operand")
+	     (const_int 1))]
   "TARGET_SVE"
   {
     static const char *const insns[][2] = {
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index f2e3d905dbb..9395df7164e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -818,10 +818,25 @@
   [(set_attr "type" "no_insn")]
 )
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand")
+            (match_operand:QI 1 "const_int_operand")
+            (match_operand:QI 2 "const_int_operand")
+	    (match_operand:QI 3 "const_int_operand"))]
+  ""
+  {
+    if (INTVAL (operands[3]) == 0)
+    {
+      warning (0, "instruction prefetch is not yet implemented; using data prefetch");
+      operands[3] = const1_rtx;
+    }
+  })
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand" "Dp")
             (match_operand:QI 1 "const_int_operand" "")
-            (match_operand:QI 2 "const_int_operand" ""))]
+            (match_operand:QI 2 "const_int_operand" "")
+	    (const_int 1))]
   ""
   {
     const char * pftype[2][4] =
diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
index 87514330c22..46fd6a7b7cb 100644
--- a/gcc/config/alpha/alpha.md
+++ b/gcc/config/alpha/alpha.md
@@ -5176,10 +5176,25 @@
 ;;
 ;; On EV6, these become official prefetch instructions.
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:DI 0 "address_operand")
+	     (match_operand:DI 1 "const_int_operand")
+	     (match_operand:DI 2 "const_int_operand")
+	     (match_operand:DI 3 "const_int_operand"))]
+  "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:DI 0 "address_operand" "p")
 	     (match_operand:DI 1 "const_int_operand" "n")
-	     (match_operand:DI 2 "const_int_operand" "n"))]
+	     (match_operand:DI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
 {
   /* Interpret "no temporal locality" as this data should be evicted once
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 458d3edf716..9607a0dd572 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -5255,14 +5255,22 @@ archs4x, archs4xd"
 (define_expand "prefetch"
   [(prefetch (match_operand:SI 0 "address_operand" "")
 	     (match_operand:SI 1 "const_int_operand" "")
-	     (match_operand:SI 2 "const_int_operand" ""))]
+	     (match_operand:SI 2 "const_int_operand" "")
+	     (match_operand:SI 3 "const_int_operand" ""))]
   "TARGET_HS"
-  "")
+  {
+    if (INTVAL (operands[3]) == 0)
+    {
+      warning (0, "instruction prefetch is not supported; using data prefetch");
+      operands[3] = const1_rtx;
+    }
+  })
 
 (define_insn "prefetch_1"
   [(prefetch (match_operand:SI 0 "register_operand" "r")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_HS"
   {
    if (INTVAL (operands[1]))
@@ -5277,7 +5285,8 @@ archs4x, archs4xd"
   [(prefetch (plus:SI (match_operand:SI 0 "register_operand" "r,r,r")
 		      (match_operand:SI 1 "nonmemory_operand" "r,Cm2,Cal"))
 	     (match_operand:SI 2 "const_int_operand" "n,n,n")
-	     (match_operand:SI 3 "const_int_operand" "n,n,n"))]
+	     (match_operand:SI 3 "const_int_operand" "n,n,n")
+	     (const_int 1))]
   "TARGET_HS"
   {
    if (INTVAL (operands[2]))
@@ -5291,7 +5300,8 @@ archs4x, archs4xd"
 (define_insn "prefetch_3"
   [(prefetch (match_operand:SI 0 "address_operand" "p")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_HS"
   {
    operands[0] = gen_rtx_MEM (SImode, operands[0]);
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 69bf343fb0e..c4b940a443a 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -12206,10 +12206,25 @@
 
 ;; V5E instructions.
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:SI 0 "address_operand")
+	     (match_operand:SI 1 "")
+	     (match_operand:SI 2 "")
+	     (match_operand:SI 3 ""))]
+  "TARGET_32BIT && arm_arch5te"
+  {
+    if (INTVAL (operands[3]) == 0)
+    {
+      warning (0, "instruction prefetch is not yet implemented; using data prefetch");
+      operands[3] = const1_rtx;
+    }
+  })
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:SI 0 "address_operand" "p")
 	     (match_operand:SI 1 "" "")
-	     (match_operand:SI 2 "" ""))]
+	     (match_operand:SI 2 "" "")
+	     (const_int 1))]
   "TARGET_32BIT && arm_arch5te"
   "pld\\t%a0"
   [(set_attr "type" "load_4")]
diff --git a/gcc/config/frv/frv.md b/gcc/config/frv/frv.md
index 6258fe3b99e..2fb9de593c9 100644
--- a/gcc/config/frv/frv.md
+++ b/gcc/config/frv/frv.md
@@ -7631,7 +7631,8 @@
   [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
 			UNSPEC_PREFETCH0)
 	     (const_int 0)
-	     (const_int 0))]
+	     (const_int 0)
+	     (const_int 1))]
   ""
   "dcpl %0, gr0, #0"
   [(set_attr "length" "4")])
@@ -7640,7 +7641,8 @@
   [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
 			UNSPEC_PREFETCH)
 	     (const_int 0)
-	     (const_int 0))]
+	     (const_int 0)
+	     (const_int 1))]
   "TARGET_FR500_FR550_BUILTINS"
   "nop.p\\n\\tnldub @(%0, gr0), gr0"
   [(set_attr "length" "8")])
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6688d92b63c..f63986587f4 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -23716,9 +23716,15 @@
 (define_expand "prefetch"
   [(prefetch (match_operand 0 "address_operand")
 	     (match_operand:SI 1 "const_int_operand")
-	     (match_operand:SI 2 "const_int_operand"))]
+	     (match_operand:SI 2 "const_int_operand")
+	     (match_operand:SI 3 "const_int_operand"))]
   "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
 {
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
   bool write = operands[1] != const0_rtx;
   int locality = INTVAL (operands[2]);
 
@@ -23760,7 +23766,8 @@
 (define_insn "*prefetch_sse"
   [(prefetch (match_operand 0 "address_operand" "p")
 	     (const_int 0)
-	     (match_operand:SI 1 "const_int_operand"))]
+	     (match_operand:SI 1 "const_int_operand")
+	     (const_int 1))]
   "TARGET_PREFETCH_SSE"
 {
   static const char * const patterns[4] = {
@@ -23781,7 +23788,8 @@
 (define_insn "*prefetch_3dnow"
   [(prefetch (match_operand 0 "address_operand" "p")
 	     (match_operand:SI 1 "const_int_operand")
-	     (const_int 3))]
+	     (const_int 3)
+	     (const_int 1))]
   "TARGET_3DNOW || TARGET_PRFCHW || TARGET_PREFETCHWT1"
 {
   if (operands[1] == const0_rtx)
@@ -23797,7 +23805,8 @@
 (define_insn "*prefetch_prefetchwt1"
   [(prefetch (match_operand 0 "address_operand" "p")
 	     (const_int 1)
-	     (const_int 2))]
+	     (const_int 2)
+	     (const_int 1))]
   "TARGET_PREFETCHWT1"
   "prefetchwt1\t%a0";
   [(set_attr "type" "sse")
diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index 5d1d47da55b..9fbbea3412a 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -5018,10 +5018,25 @@
   "break.f 0"
   [(set_attr "itanium_class" "nop_f")])
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:DI 0 "address_operand")
+	     (match_operand:DI 1 "const_int_operand")
+	     (match_operand:DI 2 "const_int_operand")
+	     (match_operand:DI 3 "const_int_operand"))]
+  ""
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:DI 0 "address_operand" "p")
 	     (match_operand:DI 1 "const_int_operand" "n")
-	     (match_operand:DI 2 "const_int_operand" "n"))]
+	     (match_operand:DI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
   static const char * const alt[2][4] = {
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index e0f0a582732..b5c547806b4 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -7227,10 +7227,25 @@
 ;;
 
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand:QI 0 "address_operand")
+	     (match_operand 1 "const_int_operand")
+	     (match_operand 2 "const_int_operand")
+	     (match_operand 3 "const_int_operand"))]
+  "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand:QI 0 "address_operand" "ZD")
 	     (match_operand 1 "const_int_operand" "n")
-	     (match_operand 2 "const_int_operand" "n"))]
+	     (match_operand 2 "const_int_operand" "n")
+	     (const_int 1))]
   "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
 {
   if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT)
@@ -7257,7 +7272,8 @@
   [(prefetch (plus:P (match_operand:P 0 "register_operand" "d")
 		     (match_operand:P 1 "register_operand" "d"))
 	     (match_operand 2 "const_int_operand" "n")
-	     (match_operand 3 "const_int_operand" "n"))]
+	     (match_operand 3 "const_int_operand" "n")
+	     (const_int 1))]
   "ISA_HAS_PREFETCHX && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
 {
   if (TARGET_LOONGSON_EXT)
diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index 76ae35d4cfa..a7469074c01 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -10201,9 +10201,16 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 (define_expand "prefetch"
   [(match_operand 0 "address_operand" "")
    (match_operand 1 "const_int_operand" "")
-   (match_operand 2 "const_int_operand" "")]
+   (match_operand 2 "const_int_operand" "")
+   (match_operand 3 "const_int_operand" "")]
   "TARGET_PA_20"
 {
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+
   operands[0] = copy_addr_to_reg (operands[0]);
   emit_insn (gen_prefetch_20 (operands[0], operands[1], operands[2]));
   DONE;
@@ -10212,7 +10219,8 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 (define_insn "prefetch_20"
   [(prefetch (match_operand 0 "pmode_register_operand" "r")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "TARGET_PA_20"
 {
   /* The SL cache-control completer indicates good spatial locality but
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ad5a4cf2ef8..21ff09eca93 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14060,10 +14060,25 @@
   DONE;
 })
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand 0 "indexed_or_indirect_address")
+	     (match_operand:SI 1 "const_int_operand")
+	     (match_operand:SI 2 "const_int_operand")
+	     (match_operand:SI 3 "const_int_operand"))]
+  ""
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
 
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index ae309471f04..3fc5ae196b8 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -5697,13 +5697,13 @@ s390_expand_cpymem (rtx dst, rtx src, rtx len)
 
 	  /* Issue a read prefetch for the +3 cache line.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, src_addr, GEN_INT (768)),
-				   const0_rtx, const0_rtx);
+				   const0_rtx, const0_rtx, const1_rtx);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	  emit_insn (prefetch);
 
 	  /* Issue a write prefetch for the +3 cache line.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, GEN_INT (768)),
-				   const1_rtx, const0_rtx);
+				   const1_rtx, const0_rtx, const1_rtx);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	  emit_insn (prefetch);
 	}
@@ -5872,7 +5872,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
 	  /* Issue a write prefetch.  */
 	  rtx distance = GEN_INT (TARGET_SETMEM_PREFETCH_DISTANCE);
 	  rtx prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, distance),
-				       const1_rtx, const0_rtx);
+				       const1_rtx, const0_rtx, const1_rtx);
 	  emit_insn (prefetch);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	}
@@ -5999,13 +5999,13 @@ s390_expand_cmpmem (rtx target, rtx op0, rtx op1, rtx len)
 
 	  /* Issue a read prefetch for the +2 cache line of operand 1.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr0, GEN_INT (512)),
-				   const0_rtx, const0_rtx);
+				   const0_rtx, const0_rtx, const1_rtx);
 	  emit_insn (prefetch);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 
 	  /* Issue a read prefetch for the +2 cache line of operand 2.  */
 	  prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr1, GEN_INT (512)),
-				   const0_rtx, const0_rtx);
+				   const0_rtx, const0_rtx, const1_rtx);
 	  emit_insn (prefetch);
 	  PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
 	}
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 962927c3112..4b094aa2bcf 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -11601,10 +11601,25 @@
 ; Data prefetch patterns
 ;
 
-(define_insn "prefetch"
+(define_expand "prefetch"
+  [(prefetch (match_operand 0    "address_operand")
+	     (match_operand:SI 1 "const_int_operand")
+	     (match_operand:SI 2 "const_int_operand")
+             (match_operand:SI 3 "const_int_operand"))]
+  "TARGET_Z10"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
+
+(define_insn "*prefetch"
   [(prefetch (match_operand 0    "address_operand"   "ZT,X")
 	     (match_operand:SI 1 "const_int_operand" " n,n")
-	     (match_operand:SI 2 "const_int_operand" " n,n"))]
+	     (match_operand:SI 2 "const_int_operand" " n,n")
+             (const_int 1))]
   "TARGET_Z10"
 {
   switch (which_alternative)
diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
index 59a7b216433..54a8270e80e 100644
--- a/gcc/config/sh/sh.md
+++ b/gcc/config/sh/sh.md
@@ -10928,13 +10928,22 @@
 (define_expand "prefetch"
   [(prefetch (match_operand 0 "address_operand" "")
 	     (match_operand:SI 1 "const_int_operand" "")
-	     (match_operand:SI 2 "const_int_operand" ""))]
-  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP")
+	     (match_operand:SI 2 "const_int_operand" "")
+	     (match_operand:SI 3 "const_int_operand" ""))]
+  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP"
+{
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+})
 
 (define_insn "*prefetch"
   [(prefetch (match_operand:SI 0 "register_operand" "r")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   "(TARGET_SH2A || TARGET_SH3) && ! TARGET_VXWORKS_RTP"
   "pref	@%0"
   [(set_attr "type" "other")])
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 691e707863a..04cb6935b1b 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -7816,9 +7816,16 @@ visl")
 (define_expand "prefetch"
   [(match_operand 0 "address_operand" "")
    (match_operand 1 "const_int_operand" "")
-   (match_operand 2 "const_int_operand" "")]
+   (match_operand 2 "const_int_operand" "")
+   (match_operand 3 "const_int_operand" "")]
   "TARGET_V9"
 {
+  if (INTVAL (operands[3]) == 0)
+  {
+    warning (0, "instruction prefetch is not supported; using data prefetch");
+    operands[3] = const1_rtx;
+  }
+
   if (TARGET_ARCH64)
     emit_insn (gen_prefetch_64 (operands[0], operands[1], operands[2]));
   else
@@ -7829,7 +7836,8 @@ visl")
 (define_insn "prefetch_64"
   [(prefetch (match_operand:DI 0 "address_operand" "p")
 	     (match_operand:DI 1 "const_int_operand" "n")
-	     (match_operand:DI 2 "const_int_operand" "n"))]
+	     (match_operand:DI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
   static const char * const prefetch_instr[2][2] = {
@@ -7855,7 +7863,8 @@ visl")
 (define_insn "prefetch_32"
   [(prefetch (match_operand:SI 0 "address_operand" "p")
 	     (match_operand:SI 1 "const_int_operand" "n")
-	     (match_operand:SI 2 "const_int_operand" "n"))]
+	     (match_operand:SI 2 "const_int_operand" "n")
+	     (const_int 1))]
   ""
 {
   static const char * const prefetch_instr[2][2] = {
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 43c9ee8bffe..592f4b0e4dd 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -3454,7 +3454,7 @@ position of @var{base}, @var{min} and @var{max} to the containing insn
 and of @var{min} and @var{max} to @var{base}.  See rtl.def for details.
 
 @findex prefetch
-@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality})
+@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality} @var{cache})
 Represents prefetch of memory at address @var{addr}.
 Operand @var{rw} is 1 if the prefetch is for data to be written, 0 otherwise;
 targets that do not support write prefetches should treat this as a normal
@@ -3462,6 +3462,10 @@ prefetch.
 Operand @var{locality} specifies the amount of temporal locality; 0 if there
 is none or 1, 2, or 3 for increasing levels of temporal locality;
 targets that do not support locality hints should ignore this.
+Operand @var{cache} is 1 if the prefetch is prefetching data, 0 for prefetching
+instruction;
+targets that do not support instruction prefetch should treat all as data
+prefetch.
 
 This insn is used to minimize cache-miss latency by moving data into a
 cache before it is accessed.  It should use only non-faulting data prefetch
diff --git a/gcc/rtl.def b/gcc/rtl.def
index 08e31fa3544..f2e37d55023 100644
--- a/gcc/rtl.def
+++ b/gcc/rtl.def
@@ -277,10 +277,11 @@ DEF_RTL_EXPR(ADDR_DIFF_VEC, "addr_diff_vec", "eEee0", RTX_EXTRA)
    Operand 3 is the level of temporal locality; 0 means there is no
    temporal locality and 1, 2, and 3 are for increasing levels of temporal
    locality.
+   Operand 4 is 1 for prefetch data, 0 for prefetch instrction.
 
-   The attributes specified by operands 2 and 3 are ignored for targets
+   The attributes specified by operands 2, 3 and 4 are ignored for targets
    whose prefetch instructions do not support them.  */
-DEF_RTL_EXPR(PREFETCH, "prefetch", "eee", RTX_EXTRA)
+DEF_RTL_EXPR(PREFETCH, "prefetch", "eeee", RTX_EXTRA)
 
 /* ----------------------------------------------------------------------
    At the top level of an instruction (perhaps under PARALLEL).
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 56da7435a28..7eeef285f1e 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -6196,7 +6196,7 @@ setup_reg_subrtx_bounds (unsigned int code)
   while (format[i] == 'e');
   rtx_all_subrtx_bounds[code].count = i - rtx_all_subrtx_bounds[code].start;
   /* rtl-iter.h relies on this.  */
-  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 3);
+  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 4);
 
   for (; format[i]; ++i)
     if (format[i] == 'E' || format[i] == 'V' || format[i] == 'e')
diff --git a/gcc/target-insns.def b/gcc/target-insns.def
index de8c0092f98..ca13d1c4393 100644
--- a/gcc/target-insns.def
+++ b/gcc/target-insns.def
@@ -76,7 +76,7 @@ DEF_TARGET_INSN (omp_simt_ordered, (rtx x0, rtx x1))
 DEF_TARGET_INSN (omp_simt_vote_any, (rtx x0, rtx x1))
 DEF_TARGET_INSN (omp_simt_xchg_bfly, (rtx x0, rtx x1, rtx x2))
 DEF_TARGET_INSN (omp_simt_xchg_idx, (rtx x0, rtx x1, rtx x2))
-DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2))
+DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2, rtx x3))
 DEF_TARGET_INSN (probe_stack, (rtx x0))
 DEF_TARGET_INSN (probe_stack_address, (rtx x0))
 DEF_TARGET_INSN (prologue, (void))
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
index 4ee05a94d9f..ccc5fab15e5 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
@@ -1,57 +1,62 @@
 /* Test that __builtin_prefetch does no harm.
 
-   Prefetch using all valid combinations of rw and locality values.
+   Prefetch using all valid combinations of cache, rw and locality values.
    These must be compile-time constants.  */
 
 #define NO_TEMPORAL_LOCALITY 0
 #define LOW_TEMPORAL_LOCALITY 1
-#define MODERATE_TEMPORAL_LOCALITY 1
+#define MODERATE_TEMPORAL_LOCALITY 2
 #define HIGH_TEMPORAL_LOCALITY 3
 
 #define WRITE_ACCESS 1
 #define READ_ACCESS 0
 
+#define DATA_PRFCH 1
+#define INST_PRFCH 0
+
 enum locality { none, low, moderate, high };
 enum rw { read, write };
+enum cache { inst, data };
 
 int arr[10];
 
 void
 good_const (const int *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, READ_ACCESS, 3);
-  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY);
-  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY);
-  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY);
-  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, READ_ACCESS, 3, 1);
+  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY, 1);
+  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY, 1);
+  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY, 1);
+  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY, DATA_PRFCH);
 }
 
 void
 good_enum (const int *p)
 {
-    __builtin_prefetch (p, read, none);
-    __builtin_prefetch (p, read, low);
-    __builtin_prefetch (p, read, moderate);
-    __builtin_prefetch (p, read, high);
-    __builtin_prefetch (p, write, none);
-    __builtin_prefetch (p, write, low);
-    __builtin_prefetch (p, write, moderate);
-    __builtin_prefetch (p, write, high);
+    __builtin_prefetch (p, read, none, data);
+    __builtin_prefetch (p, read, low, data);
+    __builtin_prefetch (p, read, moderate, data);
+    __builtin_prefetch (p, read, high, data);
+    __builtin_prefetch (p, write, none, data);
+    __builtin_prefetch (p, write, low, data);
+    __builtin_prefetch (p, write, moderate, data);
+    __builtin_prefetch (p, write, high, data);
 }
 
 void
 good_expr (const int *p)
 {
-  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3));
-  __builtin_prefetch (p, 1 + 0, 1 + 2);
+  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3), 1 + 0);
+  __builtin_prefetch (p, 1 + 0, 1 + 2, 0 + 1);
 }
 
 void
 good_vararg (const int *p)
 {
+  __builtin_prefetch (p, 0, 3, 1);
   __builtin_prefetch (p, 0, 3);
   __builtin_prefetch (p, 0);
   __builtin_prefetch (p, 1);
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
index 530a1b0ef0d..6aff1f281e0 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
@@ -26,9 +26,9 @@ struct S *ptr_str = &str;
 void
 simple_global ()
 {
-  __builtin_prefetch (glob_int_arr, 0, 0);
-  __builtin_prefetch (glob_ptr_int, 0, 0);
-  __builtin_prefetch (&glob_int, 0, 0);
+  __builtin_prefetch (glob_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
+  __builtin_prefetch (&glob_int, 0, 0, 1);
 }
 
 /* Prefetch file-level static variables using the address of the variable.  */
@@ -36,9 +36,9 @@ simple_global ()
 void
 simple_file ()
 {
-  __builtin_prefetch (stat_int_arr, 0, 0);
-  __builtin_prefetch (stat_ptr_int, 0, 0);
-  __builtin_prefetch (&stat_int, 0, 0);
+  __builtin_prefetch (stat_int_arr, 0, 0, 1);
+  __builtin_prefetch (stat_ptr_int, 0, 0, 1);
+  __builtin_prefetch (&stat_int, 0, 0, 1);
 }
 
 /* Prefetch local static variables using the address of the variable.  */
@@ -49,9 +49,9 @@ simple_static_local ()
   static int gx[100];
   static int *hx = gx;
   static int ix;
-  __builtin_prefetch (gx, 0, 0);
-  __builtin_prefetch (hx, 0, 0);
-  __builtin_prefetch (&ix, 0, 0);
+  __builtin_prefetch (gx, 0, 0, 1);
+  __builtin_prefetch (hx, 0, 0, 1);
+  __builtin_prefetch (&ix, 0, 0, 1);
 }
 
 /* Prefetch local stack variables using the address of the variable.  */
@@ -62,9 +62,9 @@ simple_local ()
   int gx[100];
   int *hx = gx;
   int ix;
-  __builtin_prefetch (gx, 0, 0);
-  __builtin_prefetch (hx, 0, 0);
-  __builtin_prefetch (&ix, 0, 0);
+  __builtin_prefetch (gx, 0, 0, 1);
+  __builtin_prefetch (hx, 0, 0, 1);
+  __builtin_prefetch (&ix, 0, 0, 1);
 }
 
 /* Prefetch arguments using the address of the variable.  */
@@ -72,9 +72,9 @@ simple_local ()
 void
 simple_arg (int g[100], int *h, int i)
 {
-  __builtin_prefetch (g, 0, 0);
-  __builtin_prefetch (h, 0, 0);
-  __builtin_prefetch (&i, 0, 0);
+  __builtin_prefetch (g, 0, 0, 1);
+  __builtin_prefetch (h, 0, 0, 1);
+  __builtin_prefetch (&i, 0, 0, 1);
 }
 
 /* Prefetch using address expressions involving global variables.  */
@@ -82,25 +82,25 @@ simple_arg (int g[100], int *h, int i)
 void
 expr_global (void)
 {
-  __builtin_prefetch (&str, 0, 0);
-  __builtin_prefetch (ptr_str, 0, 0);
-  __builtin_prefetch (&str.b, 0, 0);
-  __builtin_prefetch (&ptr_str->b, 0, 0);
-  __builtin_prefetch (&str.d, 0, 0);
-  __builtin_prefetch (&ptr_str->d, 0, 0);
-  __builtin_prefetch (str.next, 0, 0);
-  __builtin_prefetch (ptr_str->next, 0, 0);
-  __builtin_prefetch (str.next->d, 0, 0);
-  __builtin_prefetch (ptr_str->next->d, 0, 0);
-
-  __builtin_prefetch (&glob_int_arr, 0, 0);
-  __builtin_prefetch (glob_ptr_int, 0, 0);
-  __builtin_prefetch (&glob_int_arr[2], 0, 0);
-  __builtin_prefetch (&glob_ptr_int[3], 0, 0);
-  __builtin_prefetch (glob_int_arr+3, 0, 0);
-  __builtin_prefetch (glob_int_arr+glob_int, 0, 0);
-  __builtin_prefetch (glob_ptr_int+5, 0, 0);
-  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0);
+  __builtin_prefetch (&str, 0, 0, 1);
+  __builtin_prefetch (ptr_str, 0, 0, 1);
+  __builtin_prefetch (&str.b, 0, 0, 1);
+  __builtin_prefetch (&ptr_str->b, 0, 0, 1);
+  __builtin_prefetch (&str.d, 0, 0, 1);
+  __builtin_prefetch (&ptr_str->d, 0, 0, 1);
+  __builtin_prefetch (str.next, 0, 0, 1);
+  __builtin_prefetch (ptr_str->next, 0, 0, 1);
+  __builtin_prefetch (str.next->d, 0, 0, 1);
+  __builtin_prefetch (ptr_str->next->d, 0, 0, 1);
+
+  __builtin_prefetch (&glob_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
+  __builtin_prefetch (&glob_int_arr[2], 0, 0, 1);
+  __builtin_prefetch (&glob_ptr_int[3], 0, 0, 1);
+  __builtin_prefetch (glob_int_arr+3, 0, 0, 1);
+  __builtin_prefetch (glob_int_arr+glob_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0, 1);
 }
 
 /* Prefetch using address expressions involving local variables.  */
@@ -114,25 +114,25 @@ expr_local (void)
   struct S *pt = &t;
   int j = 4;
 
-  __builtin_prefetch (&t, 0, 0);
-  __builtin_prefetch (pt, 0, 0);
-  __builtin_prefetch (&t.b, 0, 0);
-  __builtin_prefetch (&pt->b, 0, 0);
-  __builtin_prefetch (&t.d, 0, 0);
-  __builtin_prefetch (&pt->d, 0, 0);
-  __builtin_prefetch (t.next, 0, 0);
-  __builtin_prefetch (pt->next, 0, 0);
-  __builtin_prefetch (t.next->d, 0, 0);
-  __builtin_prefetch (pt->next->d, 0, 0);
-
-  __builtin_prefetch (&b, 0, 0);
-  __builtin_prefetch (pb, 0, 0);
-  __builtin_prefetch (&b[2], 0, 0);
-  __builtin_prefetch (&pb[3], 0, 0);
-  __builtin_prefetch (b+3, 0, 0);
-  __builtin_prefetch (b+j, 0, 0);
-  __builtin_prefetch (pb+5, 0, 0);
-  __builtin_prefetch (pb+j, 0, 0);
+  __builtin_prefetch (&t, 0, 0, 1);
+  __builtin_prefetch (pt, 0, 0, 1);
+  __builtin_prefetch (&t.b, 0, 0, 1);
+  __builtin_prefetch (&pt->b, 0, 0, 1);
+  __builtin_prefetch (&t.d, 0, 0, 1);
+  __builtin_prefetch (&pt->d, 0, 0, 1);
+  __builtin_prefetch (t.next, 0, 0, 1);
+  __builtin_prefetch (pt->next, 0, 0, 1);
+  __builtin_prefetch (t.next->d, 0, 0, 1);
+  __builtin_prefetch (pt->next->d, 0, 0, 1);
+
+  __builtin_prefetch (&b, 0, 0, 1);
+  __builtin_prefetch (pb, 0, 0, 1);
+  __builtin_prefetch (&b[2], 0, 0, 1);
+  __builtin_prefetch (&pb[3], 0, 0, 1);
+  __builtin_prefetch (b+3, 0, 0, 1);
+  __builtin_prefetch (b+j, 0, 0, 1);
+  __builtin_prefetch (pb+5, 0, 0, 1);
+  __builtin_prefetch (pb+j, 0, 0, 1);
 }
 
 int
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
index 2e2e808c172..38ce410384a 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
@@ -36,11 +36,11 @@ volatile struct S * volatile vol_ptr_vol_str = &vol_str;
 void
 simple_vol_global ()
 {
-  __builtin_prefetch (glob_vol_int_arr, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
-  __builtin_prefetch (&glob_vol_int, 0, 0);
+  __builtin_prefetch (glob_vol_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (&glob_vol_int, 0, 0, 1);
 }
 
 /* Prefetch volatile static variables using the address of the variable.  */
@@ -48,11 +48,11 @@ simple_vol_global ()
 void
 simple_vol_file ()
 {
-  __builtin_prefetch (stat_vol_int_arr, 0, 0);
-  __builtin_prefetch (stat_vol_ptr_int, 0, 0);
-  __builtin_prefetch (stat_ptr_vol_int, 0, 0);
-  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0);
-  __builtin_prefetch (&stat_vol_int, 0, 0);
+  __builtin_prefetch (stat_vol_int_arr, 0, 0, 1);
+  __builtin_prefetch (stat_vol_ptr_int, 0, 0, 1);
+  __builtin_prefetch (stat_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (&stat_vol_int, 0, 0, 1);
 }
 
 /* Prefetch using address expressions involving volatile global variables.  */
@@ -60,43 +60,43 @@ simple_vol_file ()
 void
 expr_vol_global (void)
 {
-  __builtin_prefetch (&vol_str, 0, 0);
-  __builtin_prefetch (ptr_vol_str, 0, 0);
-  __builtin_prefetch (vol_ptr_str, 0, 0);
-  __builtin_prefetch (vol_ptr_vol_str, 0, 0);
-  __builtin_prefetch (&vol_str.b, 0, 0);
-  __builtin_prefetch (&ptr_vol_str->b, 0, 0);
-  __builtin_prefetch (&vol_ptr_str->b, 0, 0);
-  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0);
-  __builtin_prefetch (&vol_str.d, 0, 0);
-  __builtin_prefetch (&vol_ptr_str->d, 0, 0);
-  __builtin_prefetch (&ptr_vol_str->d, 0, 0);
-  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0);
-  __builtin_prefetch (vol_str.next, 0, 0);
-  __builtin_prefetch (vol_ptr_str->next, 0, 0);
-  __builtin_prefetch (ptr_vol_str->next, 0, 0);
-  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0);
-  __builtin_prefetch (vol_str.next->d, 0, 0);
-  __builtin_prefetch (vol_ptr_str->next->d, 0, 0);
-  __builtin_prefetch (ptr_vol_str->next->d, 0, 0);
-  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0);
+  __builtin_prefetch (&vol_str, 0, 0, 1);
+  __builtin_prefetch (ptr_vol_str, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_str, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_vol_str, 0, 0, 1);
+  __builtin_prefetch (&vol_str.b, 0, 0, 1);
+  __builtin_prefetch (&ptr_vol_str->b, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_str->b, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0, 1);
+  __builtin_prefetch (&vol_str.d, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_str->d, 0, 0, 1);
+  __builtin_prefetch (&ptr_vol_str->d, 0, 0, 1);
+  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0, 1);
+  __builtin_prefetch (vol_str.next, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_str->next, 0, 0, 1);
+  __builtin_prefetch (ptr_vol_str->next, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0, 1);
+  __builtin_prefetch (vol_str.next->d, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_str->next->d, 0, 0, 1);
+  __builtin_prefetch (ptr_vol_str->next->d, 0, 0, 1);
+  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0, 1);
 
-  __builtin_prefetch (&glob_vol_int_arr, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
-  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0);
-  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0);
-  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0);
-  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0);
-  __builtin_prefetch (glob_vol_int_arr+3, 0, 0);
-  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0);
-  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0);
-  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0);
+  __builtin_prefetch (&glob_vol_int_arr, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
+  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0, 1);
+  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0, 1);
+  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0, 1);
+  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0, 1);
+  __builtin_prefetch (glob_vol_int_arr+3, 0, 0, 1);
+  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0, 1);
+  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0, 1);
 }
 
 int
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
index ade892b21a7..69b4cbe1854 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
@@ -17,7 +17,7 @@ int
 assign_arg_ptr (int *p)
 {
   int *q;
-  __builtin_prefetch ((q = p), 0, 0);
+  __builtin_prefetch ((q = p), 0, 0, 1);
   return q == p;
 }
 
@@ -25,7 +25,7 @@ int
 assign_glob_ptr (void)
 {
   int *q;
-  __builtin_prefetch ((q = ptr), 0, 0);
+  __builtin_prefetch ((q = ptr), 0, 0, 1);
   return q == ptr;
 }
 
@@ -33,7 +33,7 @@ int
 assign_arg_idx (int *p, int i)
 {
   int j;
-  __builtin_prefetch (&p[j = i], 0, 0);
+  __builtin_prefetch (&p[j = i], 0, 0, 1);
   return j == i;
 }
 
@@ -41,7 +41,7 @@ int
 assign_glob_idx (void)
 {
   int j;
-  __builtin_prefetch (&ptr[j = arrindex], 0, 0);
+  __builtin_prefetch (&ptr[j = arrindex], 0, 0, 1);
   return j == arrindex;
 }
 
@@ -53,7 +53,7 @@ preinc_arg_ptr (int *p)
 {
   int *q;
   q = p + 1;
-  __builtin_prefetch (++p, 0, 0);
+  __builtin_prefetch (++p, 0, 0, 1);
   return p == q;
 }
 
@@ -62,7 +62,7 @@ preinc_glob_ptr (void)
 {
   int *q;
   q = ptr + 1;
-  __builtin_prefetch (++ptr, 0, 0);
+  __builtin_prefetch (++ptr, 0, 0, 1);
   return ptr == q;
 }
 
@@ -71,7 +71,7 @@ postinc_arg_ptr (int *p)
 {
   int *q;
   q = p + 1;
-  __builtin_prefetch (p++, 0, 0);
+  __builtin_prefetch (p++, 0, 0, 1);
   return p == q;
 }
 
@@ -80,7 +80,7 @@ postinc_glob_ptr (void)
 {
   int *q;
   q = ptr + 1;
-  __builtin_prefetch (ptr++, 0, 0);
+  __builtin_prefetch (ptr++, 0, 0, 1);
   return ptr == q;
 }
 
@@ -89,7 +89,7 @@ predec_arg_ptr (int *p)
 {
   int *q;
   q = p - 1;
-  __builtin_prefetch (--p, 0, 0);
+  __builtin_prefetch (--p, 0, 0, 1);
   return p == q;
 }
 
@@ -98,7 +98,7 @@ predec_glob_ptr (void)
 {
   int *q;
   q = ptr - 1;
-  __builtin_prefetch (--ptr, 0, 0);
+  __builtin_prefetch (--ptr, 0, 0, 1);
   return ptr == q;
 }
 
@@ -107,7 +107,7 @@ postdec_arg_ptr (int *p)
 {
   int *q;
   q = p - 1;
-  __builtin_prefetch (p--, 0, 0);
+  __builtin_prefetch (p--, 0, 0, 1);
   return p == q;
 }
 
@@ -116,7 +116,7 @@ postdec_glob_ptr (void)
 {
   int *q;
   q = ptr - 1;
-  __builtin_prefetch (ptr--, 0, 0);
+  __builtin_prefetch (ptr--, 0, 0, 1);
   return ptr == q;
 }
 
@@ -124,7 +124,7 @@ int
 preinc_arg_idx (int *p, int i)
 {
   int j = i + 1;
-  __builtin_prefetch (&p[++i], 0, 0);
+  __builtin_prefetch (&p[++i], 0, 0, 1);
   return i == j;
 }
 
@@ -133,7 +133,7 @@ int
 preinc_glob_idx (void)
 {
   int j = arrindex + 1;
-  __builtin_prefetch (&ptr[++arrindex], 0, 0);
+  __builtin_prefetch (&ptr[++arrindex], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -141,7 +141,7 @@ int
 postinc_arg_idx (int *p, int i)
 {
   int j = i + 1;
-  __builtin_prefetch (&p[i++], 0, 0);
+  __builtin_prefetch (&p[i++], 0, 0, 1);
   return i == j;
 }
 
@@ -149,7 +149,7 @@ int
 postinc_glob_idx (void)
 {
   int j = arrindex + 1;
-  __builtin_prefetch (&ptr[arrindex++], 0, 0);
+  __builtin_prefetch (&ptr[arrindex++], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -157,7 +157,7 @@ int
 predec_arg_idx (int *p, int i)
 {
   int j = i - 1;
-  __builtin_prefetch (&p[--i], 0, 0);
+  __builtin_prefetch (&p[--i], 0, 0, 1);
   return i == j;
 }
 
@@ -165,7 +165,7 @@ int
 predec_glob_idx (void)
 {
   int j = arrindex - 1;
-  __builtin_prefetch (&ptr[--arrindex], 0, 0);
+  __builtin_prefetch (&ptr[--arrindex], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -173,7 +173,7 @@ int
 postdec_arg_idx (int *p, int i)
 {
   int j = i - 1;
-  __builtin_prefetch (&p[i--], 0, 0);
+  __builtin_prefetch (&p[i--], 0, 0, 1);
   return i == j;
 }
 
@@ -181,7 +181,7 @@ int
 postdec_glob_idx (void)
 {
   int j = arrindex - 1;
-  __builtin_prefetch (&ptr[arrindex--], 0, 0);
+  __builtin_prefetch (&ptr[arrindex--], 0, 0, 1);
   return arrindex == j;
 }
 
@@ -200,7 +200,7 @@ getptr (int *p)
 int
 funccall_arg_ptr (int *p)
 {
-  __builtin_prefetch (getptr (p), 0, 0);
+  __builtin_prefetch (getptr (p), 0, 0, 1);
   return getptrcnt == 1;
 }
 
@@ -216,7 +216,7 @@ getint (int i)
 int
 funccall_arg_idx (int *p, int i)
 {
-  __builtin_prefetch (&p[getint (i)], 0, 0);
+  __builtin_prefetch (&p[getint (i)], 0, 0, 1);
   return getintcnt == 1;
 }
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
index f42a2c0ca87..a6fa1741888 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
@@ -18,32 +18,32 @@ int idx = 3;
 void
 arg_ptr (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
+  __builtin_prefetch (p, 0, 0, 1);
 }
 
 void
 arg_idx (char *p, int i)
 {
-  __builtin_prefetch (&p[i], 0, 0);
+  __builtin_prefetch (&p[i], 0, 0, 1);
 }
 
 void
 glob_ptr (void)
 {
-  __builtin_prefetch (ptr, 0, 0);
+  __builtin_prefetch (ptr, 0, 0, 1);
 }
 
 void
 glob_idx (void)
 {
-  __builtin_prefetch (&ptr[idx], 0, 0);
+  __builtin_prefetch (&ptr[idx], 0, 0, 1);
 }
 
 int
 main ()
 {
-  __builtin_prefetch (&s.b, 0, 0);
-  __builtin_prefetch (&s.c[1], 0, 0);
+  __builtin_prefetch (&s.b, 0, 0, 1);
+  __builtin_prefetch (&s.c[1], 0, 0, 1);
 
   arg_ptr (&s.c[1]);
   arg_ptr (ptr+3);
diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
index f643c5c7286..fabecaf56dc 100644
--- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
+++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
@@ -25,7 +25,7 @@ prefetch_for_read (void)
 {
   int i;
   for (i = 0; i < ARRSIZE; i++)
-    __builtin_prefetch (bad_addr[i], 0, 0);
+    __builtin_prefetch (bad_addr[i], 0, 0, 1);
 }
 
 void
@@ -33,7 +33,7 @@ prefetch_for_write (void)
 {
   int i;
   for (i = 0; i < ARRSIZE; i++)
-    __builtin_prefetch (bad_addr[i], 1, 0);
+    __builtin_prefetch (bad_addr[i], 1, 0, 1);
 }
 
 int
diff --git a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
index 11beb4e1bbe..84d564dc72c 100644
--- a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
+++ b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
@@ -1,6 +1,6 @@
 /* Test that __builtin_prefetch does no harm.
 
-   Prefetch using some invalid rw and locality values.  These must be
+   Prefetch using some invalid cache, rw and locality values.  These must be
    compile-time constants.  */
 
 /* { dg-do run } */
@@ -9,6 +9,7 @@ extern void exit (int);
 
 enum locality { none, low, moderate, high, bogus };
 enum rw { read, write };
+enum cache { inst, data };
 
 int arr[10];
 
@@ -34,6 +35,8 @@ bad (int *p)
   __builtin_prefetch (p, 0, -1);  /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
   __builtin_prefetch (p, 0, 4);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
   __builtin_prefetch (p, 0, bogus);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
+  __builtin_prefetch (p, 0, 3, -1);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
+  __builtin_prefetch (p, 0, 3, bogus);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
 }
 
 int
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
index 638749a5a68..eb9197b357c 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
@@ -9,14 +9,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
index d793437f175..b5081815f7a 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
@@ -10,14 +10,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
index 04e814d5a9c..2317f665107 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
@@ -9,14 +9,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
index 3707c7074be..936ad9e79ad 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
+++ b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
@@ -9,14 +9,14 @@ char *msg = "howdy there";
 
 void foo (char *p)
 {
-  __builtin_prefetch (p, 0, 0);
-  __builtin_prefetch (p, 0, 1);
-  __builtin_prefetch (p, 0, 2);
-  __builtin_prefetch (p, 0, 3);
-  __builtin_prefetch (p, 1, 0);
-  __builtin_prefetch (p, 1, 1);
-  __builtin_prefetch (p, 1, 2);
-  __builtin_prefetch (p, 1, 3);
+  __builtin_prefetch (p, 0, 0, 1);
+  __builtin_prefetch (p, 0, 1, 1);
+  __builtin_prefetch (p, 0, 2, 1);
+  __builtin_prefetch (p, 0, 3, 1);
+  __builtin_prefetch (p, 1, 0, 1);
+  __builtin_prefetch (p, 1, 1, 1);
+  __builtin_prefetch (p, 1, 2, 1);
+  __builtin_prefetch (p, 1, 3, 1);
 }
 
 int main ()
diff --git a/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
new file mode 100644
index 00000000000..104d6fe59d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not yet implemented.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not yet implemented; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not yet implemented; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/alpha/prefetchi-1.c b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
new file mode 100644
index 00000000000..5d9c387e260
--- /dev/null
+++ b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=ev6" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/arc/prefetchi-1.c b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
new file mode 100644
index 00000000000..7e023ab6498
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=archs" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/arm/prefetchi-1.c b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
new file mode 100644
index 00000000000..c2c48febffc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -march=armv5te" } */
+
+/* Remind users that instruction prefetch is not yet implemented.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not yet implemented; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not yet implemented; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/hppa/prefetchi-1.c b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
new file mode 100644
index 00000000000..26854a6828d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mpa-risc-2-0" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
index 051a1b59b5b..ea0b9f6bcef 100644
--- a/gcc/testsuite/gcc.target/i386/avx-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx-1.c
@@ -153,7 +153,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
+#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-1.c b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
new file mode 100644
index 00000000000..b32d59f2e5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad(const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index ca662f7bd47..6c9742cf494 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -125,7 +125,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
+#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index ba1310f9f89..344913e9a90 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -94,7 +94,7 @@
 #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
 
 /* xmmintrin.h */
-#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
+#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
 #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
 #define __builtin_ia32_vec_set_v4hi(A, D, N) \
   __builtin_ia32_vec_set_v4hi(A, D, 0)
diff --git a/gcc/testsuite/gcc.target/ia64/prefetchi-1.c b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
new file mode 100644
index 00000000000..f082396ac2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/mips/prefetchi-1.c b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
new file mode 100644
index 00000000000..23e78a0c7ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mips4 -mexplicit-relocs" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
new file mode 100644
index 00000000000..f082396ac2e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/s390/prefetchi-1.c b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
new file mode 100644
index 00000000000..5ef557f1d8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mzarch -march=z10" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/sh/prefetchi-1.c b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
new file mode 100644
index 00000000000..347bdea8df8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { has_pref } } } */
+/* { dg-options "-O2" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
diff --git a/gcc/testsuite/gcc.target/sparc/prefetchi-1.c b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
new file mode 100644
index 00000000000..1bd7ad495e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=v9" } */
+
+/* Remind users that instruction prefetch is not supported yet.  */
+
+void
+bad (const int* p)
+{
+  __builtin_prefetch(p, 0, 3, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+  __builtin_prefetch(p, 0, 2, 0);	/* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
+}
-- 
2.18.1


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
  2022-10-14  8:46   ` Hongtao Liu
  2022-10-17 15:28   ` Richard Earnshaw
@ 2022-10-19 17:14   ` Andrew Pinski
  2022-10-19 21:14     ` Segher Boessenkool
  2022-10-19 21:06   ` Segher Boessenkool
  3 siblings, 1 reply; 26+ messages in thread
From: Andrew Pinski @ 2022-10-19 17:14 UTC (permalink / raw)
  To: Haochen Jiang
  Cc: gcc-patches, aoliva, richard.sandiford, uweigand, linkw, gnu,
	dje.gcc, olegendo, claziss, segher, mfortune, davem, dave.anglin,
	hubicka, richard.earnshaw, rguenther, marcus.shawcroft,
	ramana.radhakrishnan, hongtao.liu

On Fri, Oct 14, 2022 at 1:40 AM Haochen Jiang via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> gcc/ChangeLog:
>
>         * builtins.cc (expand_builtin_prefetch): Handle the fourth parameter in
>         expand function.
>         * config/aarch64/aarch64-sve.md: Add default parameter value.
>         * config/aarch64/aarch64.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/alpha/alpha.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/arc/arc.md: Add default parameter value.
>         * config/arm/arm.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/frv/frv.md: Ditto.
>         * config/i386/i386.md: Ditto.
>         * config/ia64/ia64.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/mips/mips.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/pa/pa.md: Ditto.
>         * config/rs6000/rs6000.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for
>         gen_prefetch call.
>         (s390_expand_setmem): Ditto.
>         (s390_expand_cmpmem): Ditto.
>         * config/s390/s390.md (prefetch): New define_expand.
>         (*prefetch): Add default parameter value.
>         * config/sh/sh.md: Ditto.
>         * config/sparc/sparc.md: Ditto.
>         * doc/rtl.texi: Document cache variable for prefetch.
>         * rtl.def (PREFETCH): Change prefetch DEF_RTL_EXPR to add fourth parameter.
>         * rtlanal.cc (setup_reg_subrtx_bounds): Change gcc_checking_assert for
>         fourth parameter.
>         * target-insns.def (prefetch): Add fourth rtx for prefetch.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.c-torture/execute/builtin-prefetch-1.c: Add fourth parameter for
>         testcases.
>         * gcc.c-torture/execute/builtin-prefetch-2.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-3.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-4.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-5.c: Ditto.
>         * gcc.c-torture/execute/builtin-prefetch-6.c: Ditto.
>         * gcc.dg/builtin-prefetch-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-3dnow-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-none-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-sse-1.c: Ditto.
>         * gcc.target/i386/avx-1.c: Change prefetch macro define to variable args.
>         * gcc.target/i386/sse-13.c: Ditto.
>         * gcc.target/i386/sse-23.c: Ditto.
>         * gcc.target/aarch64/prefetchi-1.c: New test.
>         * gcc.target/alpha/prefetchi-1.c: Ditto.
>         * gcc.target/arc/prefetchi-1.c: Ditto.
>         * gcc.target/arm/prefetchi-1.c: Ditto.
>         * gcc.target/hppa/prefetchi-1.c: Ditto.
>         * gcc.target/i386/prefetchi-1.c: Ditto.
>         * gcc.target/ia64/prefetchi-1.c: Ditto.
>         * gcc.target/mips/prefetchi-1.c: Ditto.
>         * gcc.target/powerpc/prefetchi-1.c: Ditto.
>         * gcc.target/s390/prefetchi-1.c: Ditto.
>         * gcc.target/sh/prefetchi-1.c: Ditto.
>         * gcc.target/sparc/prefetchi-1.c: Ditto.


Do the testcases really need to be changed rather than adding new testcases?
Usually it is better if the testcases not change unless really needed
to be. That is do these testcases pass without being changed? If not
this seems not backwards compatible change and is not something which
we should do.  Otherwise you should just add new testcases instead.

Thanks,
Andrew

> ---
>  gcc/builtins.cc                               |  34 ++++--
>  gcc/config/aarch64/aarch64-sve.md             |  15 ++-
>  gcc/config/aarch64/aarch64.md                 |  19 +++-
>  gcc/config/alpha/alpha.md                     |  19 +++-
>  gcc/config/arc/arc.md                         |  20 +++-
>  gcc/config/arm/arm.md                         |  19 +++-
>  gcc/config/frv/frv.md                         |   6 +-
>  gcc/config/i386/i386.md                       |  17 ++-
>  gcc/config/ia64/ia64.md                       |  19 +++-
>  gcc/config/mips/mips.md                       |  22 +++-
>  gcc/config/pa/pa.md                           |  12 +-
>  gcc/config/rs6000/rs6000.md                   |  19 +++-
>  gcc/config/s390/s390.cc                       |  10 +-
>  gcc/config/s390/s390.md                       |  19 +++-
>  gcc/config/sh/sh.md                           |  15 ++-
>  gcc/config/sparc/sparc.md                     |  15 ++-
>  gcc/doc/rtl.texi                              |   6 +-
>  gcc/rtl.def                                   |   5 +-
>  gcc/rtlanal.cc                                |   2 +-
>  gcc/target-insns.def                          |   2 +-
>  .../execute/builtin-prefetch-1.c              |  45 ++++----
>  .../execute/builtin-prefetch-2.c              | 106 +++++++++---------
>  .../execute/builtin-prefetch-3.c              |  92 +++++++--------
>  .../execute/builtin-prefetch-4.c              |  44 ++++----
>  .../execute/builtin-prefetch-5.c              |  12 +-
>  .../execute/builtin-prefetch-6.c              |   4 +-
>  gcc/testsuite/gcc.dg/builtin-prefetch-1.c     |   5 +-
>  .../gcc.misc-tests/i386-pf-3dnow-1.c          |  16 +--
>  .../gcc.misc-tests/i386-pf-athlon-1.c         |  16 +--
>  gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c |  16 +--
>  gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  16 +--
>  .../gcc.target/aarch64/prefetchi-1.c          |  11 ++
>  gcc/testsuite/gcc.target/alpha/prefetchi-1.c  |  11 ++
>  gcc/testsuite/gcc.target/arc/prefetchi-1.c    |  11 ++
>  gcc/testsuite/gcc.target/arm/prefetchi-1.c    |  11 ++
>  gcc/testsuite/gcc.target/hppa/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/i386/avx-1.c         |   2 +-
>  gcc/testsuite/gcc.target/i386/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/i386/sse-13.c        |   2 +-
>  gcc/testsuite/gcc.target/i386/sse-23.c        |   2 +-
>  gcc/testsuite/gcc.target/ia64/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/mips/prefetchi-1.c   |  11 ++
>  .../gcc.target/powerpc/prefetchi-1.c          |  11 ++
>  gcc/testsuite/gcc.target/s390/prefetchi-1.c   |  11 ++
>  gcc/testsuite/gcc.target/sh/prefetchi-1.c     |  11 ++
>  gcc/testsuite/gcc.target/sparc/prefetchi-1.c  |  11 ++
>  46 files changed, 564 insertions(+), 241 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/alpha/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arc/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/arm/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/hppa/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/ia64/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/mips/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/sh/prefetchi-1.c
>  create mode 100644 gcc/testsuite/gcc.target/sparc/prefetchi-1.c
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 5f319b28030..2e6d0c76beb 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -1282,18 +1282,18 @@ expand_builtin_update_setjmp_buf (rtx buf_addr)
>  static void
>  expand_builtin_prefetch (tree exp)
>  {
> -  tree arg0, arg1, arg2;
> +  tree arg0, arg1, arg2, arg3;
>    int nargs;
> -  rtx op0, op1, op2;
> +  rtx op0, op1, op2, op3;
>
>    if (!validate_arglist (exp, POINTER_TYPE, 0))
>      return;
>
>    arg0 = CALL_EXPR_ARG (exp, 0);
>
> -  /* Arguments 1 and 2 are optional; argument 1 (read/write) defaults to
> -     zero (read) and argument 2 (locality) defaults to 3 (high degree of
> -     locality).  */
> +  /* Arguments 1, 2, 3 are optional; argument 1 (read/write) defaults to
> +     zero (read); argument 2 (locality) defaults to 3 (high degree of
> +     locality); argument 3 (cache type) defaults to 1 (data).  */
>    nargs = call_expr_nargs (exp);
>    if (nargs > 1)
>      arg1 = CALL_EXPR_ARG (exp, 1);
> @@ -1303,6 +1303,10 @@ expand_builtin_prefetch (tree exp)
>      arg2 = CALL_EXPR_ARG (exp, 2);
>    else
>      arg2 = integer_three_node;
> +  if (nargs > 3)
> +    arg3 = CALL_EXPR_ARG (exp, 3);
> +  else
> +    arg3 = integer_one_node;
>
>    /* Argument 0 is an address.  */
>    op0 = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL);
> @@ -1336,14 +1340,30 @@ expand_builtin_prefetch (tree exp)
>        op2 = const0_rtx;
>      }
>
> +  /* Argument 3 (cache type) must be a compile-time constant int.  */
> +  if (TREE_CODE (arg3) != INTEGER_CST)
> +    {
> +      error ("fourth argument to %<__builtin_prefetch%> must be a constant");
> +      arg3 = integer_one_node;
> +    }
> +  op3 = expand_normal (arg3);
> +  /* Argument 3 must be either zero or one.  */
> +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> +    {
> +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> +       " using one");
> +      op3 = const1_rtx;
> +    }
> +
>    if (targetm.have_prefetch ())
>      {
> -      class expand_operand ops[3];
> +      class expand_operand ops[4];
>
>        create_address_operand (&ops[0], op0);
>        create_integer_operand (&ops[1], INTVAL (op1));
>        create_integer_operand (&ops[2], INTVAL (op2));
> -      if (maybe_expand_insn (targetm.code_for_prefetch, 3, ops))
> +      create_integer_operand (&ops[3], INTVAL (op3));
> +      if (maybe_expand_insn (targetm.code_for_prefetch, 4, ops))
>         return;
>      }
>
> diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
> index e08bee197d8..0cde862bc04 100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -1944,7 +1944,8 @@
>                 (match_operand:DI 2 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH)
>              (match_operand:DI 3 "const_int_operand")
> -            (match_operand:DI 4 "const_int_operand"))]
> +            (match_operand:DI 4 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      operands[1] = gen_rtx_MEM (<MODE>mode, operands[1]);
> @@ -1984,7 +1985,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> @@ -2013,7 +2015,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> @@ -2044,7 +2047,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> @@ -2074,7 +2078,8 @@
>                 (match_operand:DI 6 "const_int_operand")]
>                UNSPEC_SVE_PREFETCH_GATHER)
>              (match_operand:DI 7 "const_int_operand")
> -            (match_operand:DI 8 "const_int_operand"))]
> +            (match_operand:DI 8 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_SVE"
>    {
>      static const char *const insns[][2] = {
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index f2e3d905dbb..94fa6b4200c 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -818,10 +818,25 @@
>    [(set_attr "type" "no_insn")]
>  )
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand")
> +            (match_operand:QI 1 "const_int_operand")
> +            (match_operand:QI 2 "const_int_operand")
> +           (match_operand:QI 3 "const_int_operand"))]
> +  ""
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:DI 0 "aarch64_prefetch_operand" "Dp")
>              (match_operand:QI 1 "const_int_operand" "")
> -            (match_operand:QI 2 "const_int_operand" ""))]
> +            (match_operand:QI 2 "const_int_operand" "")
> +           (const_int 1))]
>    ""
>    {
>      const char * pftype[2][4] =
> diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
> index 87514330c22..46fd6a7b7cb 100644
> --- a/gcc/config/alpha/alpha.md
> +++ b/gcc/config/alpha/alpha.md
> @@ -5176,10 +5176,25 @@
>  ;;
>  ;; On EV6, these become official prefetch instructions.
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "address_operand")
> +            (match_operand:DI 1 "const_int_operand")
> +            (match_operand:DI 2 "const_int_operand")
> +            (match_operand:DI 3 "const_int_operand"))]
> +  "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:DI 0 "address_operand" "p")
>              (match_operand:DI 1 "const_int_operand" "n")
> -            (match_operand:DI 2 "const_int_operand" "n"))]
> +            (match_operand:DI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_FIXUP_EV5_PREFETCH || alpha_cpu == PROCESSOR_EV6"
>  {
>    /* Interpret "no temporal locality" as this data should be evicted once
> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
> index 458d3edf716..9607a0dd572 100644
> --- a/gcc/config/arc/arc.md
> +++ b/gcc/config/arc/arc.md
> @@ -5255,14 +5255,22 @@ archs4x, archs4xd"
>  (define_expand "prefetch"
>    [(prefetch (match_operand:SI 0 "address_operand" "")
>              (match_operand:SI 1 "const_int_operand" "")
> -            (match_operand:SI 2 "const_int_operand" ""))]
> +            (match_operand:SI 2 "const_int_operand" "")
> +            (match_operand:SI 3 "const_int_operand" ""))]
>    "TARGET_HS"
> -  "")
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
>
>  (define_insn "prefetch_1"
>    [(prefetch (match_operand:SI 0 "register_operand" "r")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_HS"
>    {
>     if (INTVAL (operands[1]))
> @@ -5277,7 +5285,8 @@ archs4x, archs4xd"
>    [(prefetch (plus:SI (match_operand:SI 0 "register_operand" "r,r,r")
>                       (match_operand:SI 1 "nonmemory_operand" "r,Cm2,Cal"))
>              (match_operand:SI 2 "const_int_operand" "n,n,n")
> -            (match_operand:SI 3 "const_int_operand" "n,n,n"))]
> +            (match_operand:SI 3 "const_int_operand" "n,n,n")
> +            (const_int 1))]
>    "TARGET_HS"
>    {
>     if (INTVAL (operands[2]))
> @@ -5291,7 +5300,8 @@ archs4x, archs4xd"
>  (define_insn "prefetch_3"
>    [(prefetch (match_operand:SI 0 "address_operand" "p")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_HS"
>    {
>     operands[0] = gen_rtx_MEM (SImode, operands[0]);
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 69bf343fb0e..7f2ec97406f 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -12206,10 +12206,25 @@
>
>  ;; V5E instructions.
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:SI 0 "address_operand")
> +            (match_operand:SI 1 "")
> +            (match_operand:SI 2 "")
> +            (match_operand:SI 3 ""))]
> +  "TARGET_32BIT && arm_arch5te"
> +  {
> +    if (INTVAL (operands[3]) == 0)
> +    {
> +      warning (0, "instruction prefetch is not supported; using data prefetch");
> +      operands[3] = const1_rtx;
> +    }
> +  })
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:SI 0 "address_operand" "p")
>              (match_operand:SI 1 "" "")
> -            (match_operand:SI 2 "" ""))]
> +            (match_operand:SI 2 "" "")
> +            (const_int 1))]
>    "TARGET_32BIT && arm_arch5te"
>    "pld\\t%a0"
>    [(set_attr "type" "load_4")]
> diff --git a/gcc/config/frv/frv.md b/gcc/config/frv/frv.md
> index 6258fe3b99e..2fb9de593c9 100644
> --- a/gcc/config/frv/frv.md
> +++ b/gcc/config/frv/frv.md
> @@ -7631,7 +7631,8 @@
>    [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
>                         UNSPEC_PREFETCH0)
>              (const_int 0)
> -            (const_int 0))]
> +            (const_int 0)
> +            (const_int 1))]
>    ""
>    "dcpl %0, gr0, #0"
>    [(set_attr "length" "4")])
> @@ -7640,7 +7641,8 @@
>    [(prefetch (unspec:SI [(match_operand:SI 0 "register_operand" "r")]
>                         UNSPEC_PREFETCH)
>              (const_int 0)
> -            (const_int 0))]
> +            (const_int 0)
> +            (const_int 1))]
>    "TARGET_FR500_FR550_BUILTINS"
>    "nop.p\\n\\tnldub @(%0, gr0), gr0"
>    [(set_attr "length" "8")])
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 8e847520491..c65cf14b9f4 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -23635,9 +23635,15 @@
>  (define_expand "prefetch"
>    [(prefetch (match_operand 0 "address_operand")
>              (match_operand:SI 1 "const_int_operand")
> -            (match_operand:SI 2 "const_int_operand"))]
> +            (match_operand:SI 2 "const_int_operand")
> +            (match_operand:SI 3 "const_int_operand"))]
>    "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
>  {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
>    bool write = operands[1] != const0_rtx;
>    int locality = INTVAL (operands[2]);
>
> @@ -23679,7 +23685,8 @@
>  (define_insn "*prefetch_sse"
>    [(prefetch (match_operand 0 "address_operand" "p")
>              (const_int 0)
> -            (match_operand:SI 1 "const_int_operand"))]
> +            (match_operand:SI 1 "const_int_operand")
> +            (const_int 1))]
>    "TARGET_PREFETCH_SSE"
>  {
>    static const char * const patterns[4] = {
> @@ -23700,7 +23707,8 @@
>  (define_insn "*prefetch_3dnow"
>    [(prefetch (match_operand 0 "address_operand" "p")
>              (match_operand:SI 1 "const_int_operand")
> -            (const_int 3))]
> +            (const_int 3)
> +            (const_int 1))]
>    "TARGET_3DNOW || TARGET_PRFCHW || TARGET_PREFETCHWT1"
>  {
>    if (operands[1] == const0_rtx)
> @@ -23716,7 +23724,8 @@
>  (define_insn "*prefetch_prefetchwt1"
>    [(prefetch (match_operand 0 "address_operand" "p")
>              (const_int 1)
> -            (const_int 2))]
> +            (const_int 2)
> +            (const_int 1))]
>    "TARGET_PREFETCHWT1"
>    "prefetchwt1\t%a0";
>    [(set_attr "type" "sse")
> diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
> index 5d1d47da55b..9fbbea3412a 100644
> --- a/gcc/config/ia64/ia64.md
> +++ b/gcc/config/ia64/ia64.md
> @@ -5018,10 +5018,25 @@
>    "break.f 0"
>    [(set_attr "itanium_class" "nop_f")])
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:DI 0 "address_operand")
> +            (match_operand:DI 1 "const_int_operand")
> +            (match_operand:DI 2 "const_int_operand")
> +            (match_operand:DI 3 "const_int_operand"))]
> +  ""
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:DI 0 "address_operand" "p")
>              (match_operand:DI 1 "const_int_operand" "n")
> -            (match_operand:DI 2 "const_int_operand" "n"))]
> +            (match_operand:DI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>    static const char * const alt[2][4] = {
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index e0f0a582732..b5c547806b4 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -7227,10 +7227,25 @@
>  ;;
>
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand:QI 0 "address_operand")
> +            (match_operand 1 "const_int_operand")
> +            (match_operand 2 "const_int_operand")
> +            (match_operand 3 "const_int_operand"))]
> +  "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand:QI 0 "address_operand" "ZD")
>              (match_operand 1 "const_int_operand" "n")
> -            (match_operand 2 "const_int_operand" "n"))]
> +            (match_operand 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
>  {
>    if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT)
> @@ -7257,7 +7272,8 @@
>    [(prefetch (plus:P (match_operand:P 0 "register_operand" "d")
>                      (match_operand:P 1 "register_operand" "d"))
>              (match_operand 2 "const_int_operand" "n")
> -            (match_operand 3 "const_int_operand" "n"))]
> +            (match_operand 3 "const_int_operand" "n")
> +            (const_int 1))]
>    "ISA_HAS_PREFETCHX && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
>  {
>    if (TARGET_LOONGSON_EXT)
> diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
> index 76ae35d4cfa..a7469074c01 100644
> --- a/gcc/config/pa/pa.md
> +++ b/gcc/config/pa/pa.md
> @@ -10201,9 +10201,16 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
>  (define_expand "prefetch"
>    [(match_operand 0 "address_operand" "")
>     (match_operand 1 "const_int_operand" "")
> -   (match_operand 2 "const_int_operand" "")]
> +   (match_operand 2 "const_int_operand" "")
> +   (match_operand 3 "const_int_operand" "")]
>    "TARGET_PA_20"
>  {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +
>    operands[0] = copy_addr_to_reg (operands[0]);
>    emit_insn (gen_prefetch_20 (operands[0], operands[1], operands[2]));
>    DONE;
> @@ -10212,7 +10219,8 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
>  (define_insn "prefetch_20"
>    [(prefetch (match_operand 0 "pmode_register_operand" "r")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "TARGET_PA_20"
>  {
>    /* The SL cache-control completer indicates good spatial locality but
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index ad5a4cf2ef8..21ff09eca93 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -14060,10 +14060,25 @@
>    DONE;
>  })
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> +            (match_operand:SI 1 "const_int_operand")
> +            (match_operand:SI 2 "const_int_operand")
> +            (match_operand:SI 3 "const_int_operand"))]
> +  ""
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand 0 "indexed_or_indirect_address" "a")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index ae309471f04..3fc5ae196b8 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -5697,13 +5697,13 @@ s390_expand_cpymem (rtx dst, rtx src, rtx len)
>
>           /* Issue a read prefetch for the +3 cache line.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, src_addr, GEN_INT (768)),
> -                                  const0_rtx, const0_rtx);
> +                                  const0_rtx, const0_rtx, const1_rtx);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>           emit_insn (prefetch);
>
>           /* Issue a write prefetch for the +3 cache line.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, GEN_INT (768)),
> -                                  const1_rtx, const0_rtx);
> +                                  const1_rtx, const0_rtx, const1_rtx);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>           emit_insn (prefetch);
>         }
> @@ -5872,7 +5872,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
>           /* Issue a write prefetch.  */
>           rtx distance = GEN_INT (TARGET_SETMEM_PREFETCH_DISTANCE);
>           rtx prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, dst_addr, distance),
> -                                      const1_rtx, const0_rtx);
> +                                      const1_rtx, const0_rtx, const1_rtx);
>           emit_insn (prefetch);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>         }
> @@ -5999,13 +5999,13 @@ s390_expand_cmpmem (rtx target, rtx op0, rtx op1, rtx len)
>
>           /* Issue a read prefetch for the +2 cache line of operand 1.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr0, GEN_INT (512)),
> -                                  const0_rtx, const0_rtx);
> +                                  const0_rtx, const0_rtx, const1_rtx);
>           emit_insn (prefetch);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>
>           /* Issue a read prefetch for the +2 cache line of operand 2.  */
>           prefetch = gen_prefetch (gen_rtx_PLUS (Pmode, addr1, GEN_INT (512)),
> -                                  const0_rtx, const0_rtx);
> +                                  const0_rtx, const0_rtx, const1_rtx);
>           emit_insn (prefetch);
>           PREFETCH_SCHEDULE_BARRIER_P (prefetch) = true;
>         }
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index 962927c3112..4b094aa2bcf 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -11601,10 +11601,25 @@
>  ; Data prefetch patterns
>  ;
>
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand 0    "address_operand")
> +            (match_operand:SI 1 "const_int_operand")
> +            (match_operand:SI 2 "const_int_operand")
> +             (match_operand:SI 3 "const_int_operand"))]
> +  "TARGET_Z10"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
> +
> +(define_insn "*prefetch"
>    [(prefetch (match_operand 0    "address_operand"   "ZT,X")
>              (match_operand:SI 1 "const_int_operand" " n,n")
> -            (match_operand:SI 2 "const_int_operand" " n,n"))]
> +            (match_operand:SI 2 "const_int_operand" " n,n")
> +             (const_int 1))]
>    "TARGET_Z10"
>  {
>    switch (which_alternative)
> diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
> index 59a7b216433..54a8270e80e 100644
> --- a/gcc/config/sh/sh.md
> +++ b/gcc/config/sh/sh.md
> @@ -10928,13 +10928,22 @@
>  (define_expand "prefetch"
>    [(prefetch (match_operand 0 "address_operand" "")
>              (match_operand:SI 1 "const_int_operand" "")
> -            (match_operand:SI 2 "const_int_operand" ""))]
> -  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP")
> +            (match_operand:SI 2 "const_int_operand" "")
> +            (match_operand:SI 3 "const_int_operand" ""))]
> +  "(TARGET_SH2A || TARGET_SH3) && !TARGET_VXWORKS_RTP"
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +})
>
>  (define_insn "*prefetch"
>    [(prefetch (match_operand:SI 0 "register_operand" "r")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    "(TARGET_SH2A || TARGET_SH3) && ! TARGET_VXWORKS_RTP"
>    "pref        @%0"
>    [(set_attr "type" "other")])
> diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
> index 691e707863a..04cb6935b1b 100644
> --- a/gcc/config/sparc/sparc.md
> +++ b/gcc/config/sparc/sparc.md
> @@ -7816,9 +7816,16 @@ visl")
>  (define_expand "prefetch"
>    [(match_operand 0 "address_operand" "")
>     (match_operand 1 "const_int_operand" "")
> -   (match_operand 2 "const_int_operand" "")]
> +   (match_operand 2 "const_int_operand" "")
> +   (match_operand 3 "const_int_operand" "")]
>    "TARGET_V9"
>  {
> +  if (INTVAL (operands[3]) == 0)
> +  {
> +    warning (0, "instruction prefetch is not supported; using data prefetch");
> +    operands[3] = const1_rtx;
> +  }
> +
>    if (TARGET_ARCH64)
>      emit_insn (gen_prefetch_64 (operands[0], operands[1], operands[2]));
>    else
> @@ -7829,7 +7836,8 @@ visl")
>  (define_insn "prefetch_64"
>    [(prefetch (match_operand:DI 0 "address_operand" "p")
>              (match_operand:DI 1 "const_int_operand" "n")
> -            (match_operand:DI 2 "const_int_operand" "n"))]
> +            (match_operand:DI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>    static const char * const prefetch_instr[2][2] = {
> @@ -7855,7 +7863,8 @@ visl")
>  (define_insn "prefetch_32"
>    [(prefetch (match_operand:SI 0 "address_operand" "p")
>              (match_operand:SI 1 "const_int_operand" "n")
> -            (match_operand:SI 2 "const_int_operand" "n"))]
> +            (match_operand:SI 2 "const_int_operand" "n")
> +            (const_int 1))]
>    ""
>  {
>    static const char * const prefetch_instr[2][2] = {
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 43c9ee8bffe..592f4b0e4dd 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -3454,7 +3454,7 @@ position of @var{base}, @var{min} and @var{max} to the containing insn
>  and of @var{min} and @var{max} to @var{base}.  See rtl.def for details.
>
>  @findex prefetch
> -@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality})
> +@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality} @var{cache})
>  Represents prefetch of memory at address @var{addr}.
>  Operand @var{rw} is 1 if the prefetch is for data to be written, 0 otherwise;
>  targets that do not support write prefetches should treat this as a normal
> @@ -3462,6 +3462,10 @@ prefetch.
>  Operand @var{locality} specifies the amount of temporal locality; 0 if there
>  is none or 1, 2, or 3 for increasing levels of temporal locality;
>  targets that do not support locality hints should ignore this.
> +Operand @var{cache} is 1 if the prefetch is prefetching data, 0 for prefetching
> +instruction;
> +targets that do not support instruction prefetch should treat all as data
> +prefetch.
>
>  This insn is used to minimize cache-miss latency by moving data into a
>  cache before it is accessed.  It should use only non-faulting data prefetch
> diff --git a/gcc/rtl.def b/gcc/rtl.def
> index 08e31fa3544..f2e37d55023 100644
> --- a/gcc/rtl.def
> +++ b/gcc/rtl.def
> @@ -277,10 +277,11 @@ DEF_RTL_EXPR(ADDR_DIFF_VEC, "addr_diff_vec", "eEee0", RTX_EXTRA)
>     Operand 3 is the level of temporal locality; 0 means there is no
>     temporal locality and 1, 2, and 3 are for increasing levels of temporal
>     locality.
> +   Operand 4 is 1 for prefetch data, 0 for prefetch instrction.
>
> -   The attributes specified by operands 2 and 3 are ignored for targets
> +   The attributes specified by operands 2, 3 and 4 are ignored for targets
>     whose prefetch instructions do not support them.  */
> -DEF_RTL_EXPR(PREFETCH, "prefetch", "eee", RTX_EXTRA)
> +DEF_RTL_EXPR(PREFETCH, "prefetch", "eeee", RTX_EXTRA)
>
>  /* ----------------------------------------------------------------------
>     At the top level of an instruction (perhaps under PARALLEL).
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index 56da7435a28..7eeef285f1e 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -6196,7 +6196,7 @@ setup_reg_subrtx_bounds (unsigned int code)
>    while (format[i] == 'e');
>    rtx_all_subrtx_bounds[code].count = i - rtx_all_subrtx_bounds[code].start;
>    /* rtl-iter.h relies on this.  */
> -  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 3);
> +  gcc_checking_assert (rtx_all_subrtx_bounds[code].count <= 4);
>
>    for (; format[i]; ++i)
>      if (format[i] == 'E' || format[i] == 'V' || format[i] == 'e')
> diff --git a/gcc/target-insns.def b/gcc/target-insns.def
> index de8c0092f98..ca13d1c4393 100644
> --- a/gcc/target-insns.def
> +++ b/gcc/target-insns.def
> @@ -76,7 +76,7 @@ DEF_TARGET_INSN (omp_simt_ordered, (rtx x0, rtx x1))
>  DEF_TARGET_INSN (omp_simt_vote_any, (rtx x0, rtx x1))
>  DEF_TARGET_INSN (omp_simt_xchg_bfly, (rtx x0, rtx x1, rtx x2))
>  DEF_TARGET_INSN (omp_simt_xchg_idx, (rtx x0, rtx x1, rtx x2))
> -DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2))
> +DEF_TARGET_INSN (prefetch, (rtx x0, rtx x1, rtx x2, rtx x3))
>  DEF_TARGET_INSN (probe_stack, (rtx x0))
>  DEF_TARGET_INSN (probe_stack_address, (rtx x0))
>  DEF_TARGET_INSN (prologue, (void))
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> index 4ee05a94d9f..ccc5fab15e5 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-1.c
> @@ -1,57 +1,62 @@
>  /* Test that __builtin_prefetch does no harm.
>
> -   Prefetch using all valid combinations of rw and locality values.
> +   Prefetch using all valid combinations of cache, rw and locality values.
>     These must be compile-time constants.  */
>
>  #define NO_TEMPORAL_LOCALITY 0
>  #define LOW_TEMPORAL_LOCALITY 1
> -#define MODERATE_TEMPORAL_LOCALITY 1
> +#define MODERATE_TEMPORAL_LOCALITY 2
>  #define HIGH_TEMPORAL_LOCALITY 3
>
>  #define WRITE_ACCESS 1
>  #define READ_ACCESS 0
>
> +#define DATA_PRFCH 1
> +#define INST_PRFCH 0
> +
>  enum locality { none, low, moderate, high };
>  enum rw { read, write };
> +enum cache { inst, data };
>
>  int arr[10];
>
>  void
>  good_const (const int *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, READ_ACCESS, 3);
> -  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY);
> -  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, READ_ACCESS, 3, 1);
> +  __builtin_prefetch (p, 1, NO_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, 1, LOW_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, 1, MODERATE_TEMPORAL_LOCALITY, 1);
> +  __builtin_prefetch (p, WRITE_ACCESS, HIGH_TEMPORAL_LOCALITY, DATA_PRFCH);
>  }
>
>  void
>  good_enum (const int *p)
>  {
> -    __builtin_prefetch (p, read, none);
> -    __builtin_prefetch (p, read, low);
> -    __builtin_prefetch (p, read, moderate);
> -    __builtin_prefetch (p, read, high);
> -    __builtin_prefetch (p, write, none);
> -    __builtin_prefetch (p, write, low);
> -    __builtin_prefetch (p, write, moderate);
> -    __builtin_prefetch (p, write, high);
> +    __builtin_prefetch (p, read, none, data);
> +    __builtin_prefetch (p, read, low, data);
> +    __builtin_prefetch (p, read, moderate, data);
> +    __builtin_prefetch (p, read, high, data);
> +    __builtin_prefetch (p, write, none, data);
> +    __builtin_prefetch (p, write, low, data);
> +    __builtin_prefetch (p, write, moderate, data);
> +    __builtin_prefetch (p, write, high, data);
>  }
>
>  void
>  good_expr (const int *p)
>  {
> -  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3));
> -  __builtin_prefetch (p, 1 + 0, 1 + 2);
> +  __builtin_prefetch (p, 1 - 1, 6 - (2 * 3), 1 + 0);
> +  __builtin_prefetch (p, 1 + 0, 1 + 2, 0 + 1);
>  }
>
>  void
>  good_vararg (const int *p)
>  {
> +  __builtin_prefetch (p, 0, 3, 1);
>    __builtin_prefetch (p, 0, 3);
>    __builtin_prefetch (p, 0);
>    __builtin_prefetch (p, 1);
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> index 530a1b0ef0d..6aff1f281e0 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-2.c
> @@ -26,9 +26,9 @@ struct S *ptr_str = &str;
>  void
>  simple_global ()
>  {
> -  __builtin_prefetch (glob_int_arr, 0, 0);
> -  __builtin_prefetch (glob_ptr_int, 0, 0);
> -  __builtin_prefetch (&glob_int, 0, 0);
> +  __builtin_prefetch (glob_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_int, 0, 0, 1);
>  }
>
>  /* Prefetch file-level static variables using the address of the variable.  */
> @@ -36,9 +36,9 @@ simple_global ()
>  void
>  simple_file ()
>  {
> -  __builtin_prefetch (stat_int_arr, 0, 0);
> -  __builtin_prefetch (stat_ptr_int, 0, 0);
> -  __builtin_prefetch (&stat_int, 0, 0);
> +  __builtin_prefetch (stat_int_arr, 0, 0, 1);
> +  __builtin_prefetch (stat_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&stat_int, 0, 0, 1);
>  }
>
>  /* Prefetch local static variables using the address of the variable.  */
> @@ -49,9 +49,9 @@ simple_static_local ()
>    static int gx[100];
>    static int *hx = gx;
>    static int ix;
> -  __builtin_prefetch (gx, 0, 0);
> -  __builtin_prefetch (hx, 0, 0);
> -  __builtin_prefetch (&ix, 0, 0);
> +  __builtin_prefetch (gx, 0, 0, 1);
> +  __builtin_prefetch (hx, 0, 0, 1);
> +  __builtin_prefetch (&ix, 0, 0, 1);
>  }
>
>  /* Prefetch local stack variables using the address of the variable.  */
> @@ -62,9 +62,9 @@ simple_local ()
>    int gx[100];
>    int *hx = gx;
>    int ix;
> -  __builtin_prefetch (gx, 0, 0);
> -  __builtin_prefetch (hx, 0, 0);
> -  __builtin_prefetch (&ix, 0, 0);
> +  __builtin_prefetch (gx, 0, 0, 1);
> +  __builtin_prefetch (hx, 0, 0, 1);
> +  __builtin_prefetch (&ix, 0, 0, 1);
>  }
>
>  /* Prefetch arguments using the address of the variable.  */
> @@ -72,9 +72,9 @@ simple_local ()
>  void
>  simple_arg (int g[100], int *h, int i)
>  {
> -  __builtin_prefetch (g, 0, 0);
> -  __builtin_prefetch (h, 0, 0);
> -  __builtin_prefetch (&i, 0, 0);
> +  __builtin_prefetch (g, 0, 0, 1);
> +  __builtin_prefetch (h, 0, 0, 1);
> +  __builtin_prefetch (&i, 0, 0, 1);
>  }
>
>  /* Prefetch using address expressions involving global variables.  */
> @@ -82,25 +82,25 @@ simple_arg (int g[100], int *h, int i)
>  void
>  expr_global (void)
>  {
> -  __builtin_prefetch (&str, 0, 0);
> -  __builtin_prefetch (ptr_str, 0, 0);
> -  __builtin_prefetch (&str.b, 0, 0);
> -  __builtin_prefetch (&ptr_str->b, 0, 0);
> -  __builtin_prefetch (&str.d, 0, 0);
> -  __builtin_prefetch (&ptr_str->d, 0, 0);
> -  __builtin_prefetch (str.next, 0, 0);
> -  __builtin_prefetch (ptr_str->next, 0, 0);
> -  __builtin_prefetch (str.next->d, 0, 0);
> -  __builtin_prefetch (ptr_str->next->d, 0, 0);
> -
> -  __builtin_prefetch (&glob_int_arr, 0, 0);
> -  __builtin_prefetch (glob_ptr_int, 0, 0);
> -  __builtin_prefetch (&glob_int_arr[2], 0, 0);
> -  __builtin_prefetch (&glob_ptr_int[3], 0, 0);
> -  __builtin_prefetch (glob_int_arr+3, 0, 0);
> -  __builtin_prefetch (glob_int_arr+glob_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_int+5, 0, 0);
> -  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0);
> +  __builtin_prefetch (&str, 0, 0, 1);
> +  __builtin_prefetch (ptr_str, 0, 0, 1);
> +  __builtin_prefetch (&str.b, 0, 0, 1);
> +  __builtin_prefetch (&ptr_str->b, 0, 0, 1);
> +  __builtin_prefetch (&str.d, 0, 0, 1);
> +  __builtin_prefetch (&ptr_str->d, 0, 0, 1);
> +  __builtin_prefetch (str.next, 0, 0, 1);
> +  __builtin_prefetch (ptr_str->next, 0, 0, 1);
> +  __builtin_prefetch (str.next->d, 0, 0, 1);
> +  __builtin_prefetch (ptr_str->next->d, 0, 0, 1);
> +
> +  __builtin_prefetch (&glob_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_int_arr[2], 0, 0, 1);
> +  __builtin_prefetch (&glob_ptr_int[3], 0, 0, 1);
> +  __builtin_prefetch (glob_int_arr+3, 0, 0, 1);
> +  __builtin_prefetch (glob_int_arr+glob_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_int+glob_int, 0, 0, 1);
>  }
>
>  /* Prefetch using address expressions involving local variables.  */
> @@ -114,25 +114,25 @@ expr_local (void)
>    struct S *pt = &t;
>    int j = 4;
>
> -  __builtin_prefetch (&t, 0, 0);
> -  __builtin_prefetch (pt, 0, 0);
> -  __builtin_prefetch (&t.b, 0, 0);
> -  __builtin_prefetch (&pt->b, 0, 0);
> -  __builtin_prefetch (&t.d, 0, 0);
> -  __builtin_prefetch (&pt->d, 0, 0);
> -  __builtin_prefetch (t.next, 0, 0);
> -  __builtin_prefetch (pt->next, 0, 0);
> -  __builtin_prefetch (t.next->d, 0, 0);
> -  __builtin_prefetch (pt->next->d, 0, 0);
> -
> -  __builtin_prefetch (&b, 0, 0);
> -  __builtin_prefetch (pb, 0, 0);
> -  __builtin_prefetch (&b[2], 0, 0);
> -  __builtin_prefetch (&pb[3], 0, 0);
> -  __builtin_prefetch (b+3, 0, 0);
> -  __builtin_prefetch (b+j, 0, 0);
> -  __builtin_prefetch (pb+5, 0, 0);
> -  __builtin_prefetch (pb+j, 0, 0);
> +  __builtin_prefetch (&t, 0, 0, 1);
> +  __builtin_prefetch (pt, 0, 0, 1);
> +  __builtin_prefetch (&t.b, 0, 0, 1);
> +  __builtin_prefetch (&pt->b, 0, 0, 1);
> +  __builtin_prefetch (&t.d, 0, 0, 1);
> +  __builtin_prefetch (&pt->d, 0, 0, 1);
> +  __builtin_prefetch (t.next, 0, 0, 1);
> +  __builtin_prefetch (pt->next, 0, 0, 1);
> +  __builtin_prefetch (t.next->d, 0, 0, 1);
> +  __builtin_prefetch (pt->next->d, 0, 0, 1);
> +
> +  __builtin_prefetch (&b, 0, 0, 1);
> +  __builtin_prefetch (pb, 0, 0, 1);
> +  __builtin_prefetch (&b[2], 0, 0, 1);
> +  __builtin_prefetch (&pb[3], 0, 0, 1);
> +  __builtin_prefetch (b+3, 0, 0, 1);
> +  __builtin_prefetch (b+j, 0, 0, 1);
> +  __builtin_prefetch (pb+5, 0, 0, 1);
> +  __builtin_prefetch (pb+j, 0, 0, 1);
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> index 2e2e808c172..38ce410384a 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-3.c
> @@ -36,11 +36,11 @@ volatile struct S * volatile vol_ptr_vol_str = &vol_str;
>  void
>  simple_vol_global ()
>  {
> -  __builtin_prefetch (glob_vol_int_arr, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&glob_vol_int, 0, 0);
> +  __builtin_prefetch (glob_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_int, 0, 0, 1);
>  }
>
>  /* Prefetch volatile static variables using the address of the variable.  */
> @@ -48,11 +48,11 @@ simple_vol_global ()
>  void
>  simple_vol_file ()
>  {
> -  __builtin_prefetch (stat_vol_int_arr, 0, 0);
> -  __builtin_prefetch (stat_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (stat_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&stat_vol_int, 0, 0);
> +  __builtin_prefetch (stat_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (stat_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (stat_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (stat_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&stat_vol_int, 0, 0, 1);
>  }
>
>  /* Prefetch using address expressions involving volatile global variables.  */
> @@ -60,43 +60,43 @@ simple_vol_file ()
>  void
>  expr_vol_global (void)
>  {
> -  __builtin_prefetch (&vol_str, 0, 0);
> -  __builtin_prefetch (ptr_vol_str, 0, 0);
> -  __builtin_prefetch (vol_ptr_str, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str, 0, 0);
> -  __builtin_prefetch (&vol_str.b, 0, 0);
> -  __builtin_prefetch (&ptr_vol_str->b, 0, 0);
> -  __builtin_prefetch (&vol_ptr_str->b, 0, 0);
> -  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0);
> -  __builtin_prefetch (&vol_str.d, 0, 0);
> -  __builtin_prefetch (&vol_ptr_str->d, 0, 0);
> -  __builtin_prefetch (&ptr_vol_str->d, 0, 0);
> -  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0);
> -  __builtin_prefetch (vol_str.next, 0, 0);
> -  __builtin_prefetch (vol_ptr_str->next, 0, 0);
> -  __builtin_prefetch (ptr_vol_str->next, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0);
> -  __builtin_prefetch (vol_str.next->d, 0, 0);
> -  __builtin_prefetch (vol_ptr_str->next->d, 0, 0);
> -  __builtin_prefetch (ptr_vol_str->next->d, 0, 0);
> -  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0);
> +  __builtin_prefetch (&vol_str, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str, 0, 0, 1);
> +  __builtin_prefetch (&vol_str.b, 0, 0, 1);
> +  __builtin_prefetch (&ptr_vol_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_vol_str->b, 0, 0, 1);
> +  __builtin_prefetch (&vol_str.d, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_str->d, 0, 0, 1);
> +  __builtin_prefetch (&ptr_vol_str->d, 0, 0, 1);
> +  __builtin_prefetch (&vol_ptr_vol_str->d, 0, 0, 1);
> +  __builtin_prefetch (vol_str.next, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str->next, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str->next, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str->next, 0, 0, 1);
> +  __builtin_prefetch (vol_str.next->d, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_str->next->d, 0, 0, 1);
> +  __builtin_prefetch (ptr_vol_str->next->d, 0, 0, 1);
> +  __builtin_prefetch (vol_ptr_vol_str->next->d, 0, 0, 1);
>
> -  __builtin_prefetch (&glob_vol_int_arr, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0);
> -  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0);
> -  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0);
> -  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0);
> -  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0);
> -  __builtin_prefetch (glob_vol_int_arr+3, 0, 0);
> -  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0);
> -  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0);
> +  __builtin_prefetch (&glob_vol_int_arr, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int, 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_int_arr[2], 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_ptr_int[3], 0, 0, 1);
> +  __builtin_prefetch (&glob_ptr_vol_int[3], 0, 0, 1);
> +  __builtin_prefetch (&glob_vol_ptr_vol_int[3], 0, 0, 1);
> +  __builtin_prefetch (glob_vol_int_arr+3, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_int_arr+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int+5, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_int+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_ptr_vol_int+glob_vol_int, 0, 0, 1);
> +  __builtin_prefetch (glob_vol_ptr_vol_int+glob_vol_int, 0, 0, 1);
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> index ade892b21a7..69b4cbe1854 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
> @@ -17,7 +17,7 @@ int
>  assign_arg_ptr (int *p)
>  {
>    int *q;
> -  __builtin_prefetch ((q = p), 0, 0);
> +  __builtin_prefetch ((q = p), 0, 0, 1);
>    return q == p;
>  }
>
> @@ -25,7 +25,7 @@ int
>  assign_glob_ptr (void)
>  {
>    int *q;
> -  __builtin_prefetch ((q = ptr), 0, 0);
> +  __builtin_prefetch ((q = ptr), 0, 0, 1);
>    return q == ptr;
>  }
>
> @@ -33,7 +33,7 @@ int
>  assign_arg_idx (int *p, int i)
>  {
>    int j;
> -  __builtin_prefetch (&p[j = i], 0, 0);
> +  __builtin_prefetch (&p[j = i], 0, 0, 1);
>    return j == i;
>  }
>
> @@ -41,7 +41,7 @@ int
>  assign_glob_idx (void)
>  {
>    int j;
> -  __builtin_prefetch (&ptr[j = arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[j = arrindex], 0, 0, 1);
>    return j == arrindex;
>  }
>
> @@ -53,7 +53,7 @@ preinc_arg_ptr (int *p)
>  {
>    int *q;
>    q = p + 1;
> -  __builtin_prefetch (++p, 0, 0);
> +  __builtin_prefetch (++p, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -62,7 +62,7 @@ preinc_glob_ptr (void)
>  {
>    int *q;
>    q = ptr + 1;
> -  __builtin_prefetch (++ptr, 0, 0);
> +  __builtin_prefetch (++ptr, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -71,7 +71,7 @@ postinc_arg_ptr (int *p)
>  {
>    int *q;
>    q = p + 1;
> -  __builtin_prefetch (p++, 0, 0);
> +  __builtin_prefetch (p++, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -80,7 +80,7 @@ postinc_glob_ptr (void)
>  {
>    int *q;
>    q = ptr + 1;
> -  __builtin_prefetch (ptr++, 0, 0);
> +  __builtin_prefetch (ptr++, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -89,7 +89,7 @@ predec_arg_ptr (int *p)
>  {
>    int *q;
>    q = p - 1;
> -  __builtin_prefetch (--p, 0, 0);
> +  __builtin_prefetch (--p, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -98,7 +98,7 @@ predec_glob_ptr (void)
>  {
>    int *q;
>    q = ptr - 1;
> -  __builtin_prefetch (--ptr, 0, 0);
> +  __builtin_prefetch (--ptr, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -107,7 +107,7 @@ postdec_arg_ptr (int *p)
>  {
>    int *q;
>    q = p - 1;
> -  __builtin_prefetch (p--, 0, 0);
> +  __builtin_prefetch (p--, 0, 0, 1);
>    return p == q;
>  }
>
> @@ -116,7 +116,7 @@ postdec_glob_ptr (void)
>  {
>    int *q;
>    q = ptr - 1;
> -  __builtin_prefetch (ptr--, 0, 0);
> +  __builtin_prefetch (ptr--, 0, 0, 1);
>    return ptr == q;
>  }
>
> @@ -124,7 +124,7 @@ int
>  preinc_arg_idx (int *p, int i)
>  {
>    int j = i + 1;
> -  __builtin_prefetch (&p[++i], 0, 0);
> +  __builtin_prefetch (&p[++i], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -133,7 +133,7 @@ int
>  preinc_glob_idx (void)
>  {
>    int j = arrindex + 1;
> -  __builtin_prefetch (&ptr[++arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[++arrindex], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -141,7 +141,7 @@ int
>  postinc_arg_idx (int *p, int i)
>  {
>    int j = i + 1;
> -  __builtin_prefetch (&p[i++], 0, 0);
> +  __builtin_prefetch (&p[i++], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -149,7 +149,7 @@ int
>  postinc_glob_idx (void)
>  {
>    int j = arrindex + 1;
> -  __builtin_prefetch (&ptr[arrindex++], 0, 0);
> +  __builtin_prefetch (&ptr[arrindex++], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -157,7 +157,7 @@ int
>  predec_arg_idx (int *p, int i)
>  {
>    int j = i - 1;
> -  __builtin_prefetch (&p[--i], 0, 0);
> +  __builtin_prefetch (&p[--i], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -165,7 +165,7 @@ int
>  predec_glob_idx (void)
>  {
>    int j = arrindex - 1;
> -  __builtin_prefetch (&ptr[--arrindex], 0, 0);
> +  __builtin_prefetch (&ptr[--arrindex], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -173,7 +173,7 @@ int
>  postdec_arg_idx (int *p, int i)
>  {
>    int j = i - 1;
> -  __builtin_prefetch (&p[i--], 0, 0);
> +  __builtin_prefetch (&p[i--], 0, 0, 1);
>    return i == j;
>  }
>
> @@ -181,7 +181,7 @@ int
>  postdec_glob_idx (void)
>  {
>    int j = arrindex - 1;
> -  __builtin_prefetch (&ptr[arrindex--], 0, 0);
> +  __builtin_prefetch (&ptr[arrindex--], 0, 0, 1);
>    return arrindex == j;
>  }
>
> @@ -200,7 +200,7 @@ getptr (int *p)
>  int
>  funccall_arg_ptr (int *p)
>  {
> -  __builtin_prefetch (getptr (p), 0, 0);
> +  __builtin_prefetch (getptr (p), 0, 0, 1);
>    return getptrcnt == 1;
>  }
>
> @@ -216,7 +216,7 @@ getint (int i)
>  int
>  funccall_arg_idx (int *p, int i)
>  {
> -  __builtin_prefetch (&p[getint (i)], 0, 0);
> +  __builtin_prefetch (&p[getint (i)], 0, 0, 1);
>    return getintcnt == 1;
>  }
>
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> index f42a2c0ca87..a6fa1741888 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-5.c
> @@ -18,32 +18,32 @@ int idx = 3;
>  void
>  arg_ptr (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> +  __builtin_prefetch (p, 0, 0, 1);
>  }
>
>  void
>  arg_idx (char *p, int i)
>  {
> -  __builtin_prefetch (&p[i], 0, 0);
> +  __builtin_prefetch (&p[i], 0, 0, 1);
>  }
>
>  void
>  glob_ptr (void)
>  {
> -  __builtin_prefetch (ptr, 0, 0);
> +  __builtin_prefetch (ptr, 0, 0, 1);
>  }
>
>  void
>  glob_idx (void)
>  {
> -  __builtin_prefetch (&ptr[idx], 0, 0);
> +  __builtin_prefetch (&ptr[idx], 0, 0, 1);
>  }
>
>  int
>  main ()
>  {
> -  __builtin_prefetch (&s.b, 0, 0);
> -  __builtin_prefetch (&s.c[1], 0, 0);
> +  __builtin_prefetch (&s.b, 0, 0, 1);
> +  __builtin_prefetch (&s.c[1], 0, 0, 1);
>
>    arg_ptr (&s.c[1]);
>    arg_ptr (ptr+3);
> diff --git a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> index f643c5c7286..fabecaf56dc 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-6.c
> @@ -25,7 +25,7 @@ prefetch_for_read (void)
>  {
>    int i;
>    for (i = 0; i < ARRSIZE; i++)
> -    __builtin_prefetch (bad_addr[i], 0, 0);
> +    __builtin_prefetch (bad_addr[i], 0, 0, 1);
>  }
>
>  void
> @@ -33,7 +33,7 @@ prefetch_for_write (void)
>  {
>    int i;
>    for (i = 0; i < ARRSIZE; i++)
> -    __builtin_prefetch (bad_addr[i], 1, 0);
> +    __builtin_prefetch (bad_addr[i], 1, 0, 1);
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> index 11beb4e1bbe..84d564dc72c 100644
> --- a/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> +++ b/gcc/testsuite/gcc.dg/builtin-prefetch-1.c
> @@ -1,6 +1,6 @@
>  /* Test that __builtin_prefetch does no harm.
>
> -   Prefetch using some invalid rw and locality values.  These must be
> +   Prefetch using some invalid cache, rw and locality values.  These must be
>     compile-time constants.  */
>
>  /* { dg-do run } */
> @@ -9,6 +9,7 @@ extern void exit (int);
>
>  enum locality { none, low, moderate, high, bogus };
>  enum rw { read, write };
> +enum cache { inst, data };
>
>  int arr[10];
>
> @@ -34,6 +35,8 @@ bad (int *p)
>    __builtin_prefetch (p, 0, -1);  /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
>    __builtin_prefetch (p, 0, 4);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
>    __builtin_prefetch (p, 0, bogus);   /* { dg-warning "invalid third argument to '__builtin_prefetch'; using zero" } */
> +  __builtin_prefetch (p, 0, 3, -1);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
> +  __builtin_prefetch (p, 0, 3, bogus);   /* { dg-warning "invalid fourth argument to '__builtin_prefetch'; using one" } */
>  }
>
>  int
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> index 638749a5a68..eb9197b357c 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-3dnow-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> index d793437f175..b5081815f7a 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-athlon-1.c
> @@ -10,14 +10,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> index 04e814d5a9c..2317f665107 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-none-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> index 3707c7074be..936ad9e79ad 100644
> --- a/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> +++ b/gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c
> @@ -9,14 +9,14 @@ char *msg = "howdy there";
>
>  void foo (char *p)
>  {
> -  __builtin_prefetch (p, 0, 0);
> -  __builtin_prefetch (p, 0, 1);
> -  __builtin_prefetch (p, 0, 2);
> -  __builtin_prefetch (p, 0, 3);
> -  __builtin_prefetch (p, 1, 0);
> -  __builtin_prefetch (p, 1, 1);
> -  __builtin_prefetch (p, 1, 2);
> -  __builtin_prefetch (p, 1, 3);
> +  __builtin_prefetch (p, 0, 0, 1);
> +  __builtin_prefetch (p, 0, 1, 1);
> +  __builtin_prefetch (p, 0, 2, 1);
> +  __builtin_prefetch (p, 0, 3, 1);
> +  __builtin_prefetch (p, 1, 0, 1);
> +  __builtin_prefetch (p, 1, 1, 1);
> +  __builtin_prefetch (p, 1, 2, 1);
> +  __builtin_prefetch (p, 1, 3, 1);
>  }
>
>  int main ()
> diff --git a/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/alpha/prefetchi-1.c b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
> new file mode 100644
> index 00000000000..5d9c387e260
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/alpha/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=ev6" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/arc/prefetchi-1.c b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
> new file mode 100644
> index 00000000000..7e023ab6498
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=archs" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/arm/prefetchi-1.c b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
> new file mode 100644
> index 00000000000..0fbcb7019bc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-O2 -march=armv5te" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/hppa/prefetchi-1.c b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
> new file mode 100644
> index 00000000000..26854a6828d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/hppa/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mpa-risc-2-0" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c
> index 051a1b59b5b..ea0b9f6bcef 100644
> --- a/gcc/testsuite/gcc.target/i386/avx-1.c
> +++ b/gcc/testsuite/gcc.target/i386/avx-1.c
> @@ -153,7 +153,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-1.c b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> new file mode 100644
> index 00000000000..b32d59f2e5f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad(const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
> index ca662f7bd47..6c9742cf494 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-13.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-13.c
> @@ -125,7 +125,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
> index ba1310f9f89..344913e9a90 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-23.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-23.c
> @@ -94,7 +94,7 @@
>  #define __builtin_ia32_shufpd(A, B, N) __builtin_ia32_shufpd(A, B, 0)
>
>  /* xmmintrin.h */
> -#define __builtin_prefetch(P, A, I) __builtin_prefetch(P, 0, _MM_HINT_NTA)
> +#define __builtin_prefetch(P, ...) __builtin_prefetch(P, 0, _MM_HINT_NTA)
>  #define __builtin_ia32_pshufw(A, N) __builtin_ia32_pshufw(A, 0)
>  #define __builtin_ia32_vec_set_v4hi(A, D, N) \
>    __builtin_ia32_vec_set_v4hi(A, D, 0)
> diff --git a/gcc/testsuite/gcc.target/ia64/prefetchi-1.c b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/ia64/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/mips/prefetchi-1.c b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
> new file mode 100644
> index 00000000000..23e78a0c7ba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mips4 -mexplicit-relocs" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
> new file mode 100644
> index 00000000000..f082396ac2e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/s390/prefetchi-1.c b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
> new file mode 100644
> index 00000000000..5ef557f1d8c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mzarch -march=z10" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/sh/prefetchi-1.c b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
> new file mode 100644
> index 00000000000..347bdea8df8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/sh/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { has_pref } } } */
> +/* { dg-options "-O2" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> diff --git a/gcc/testsuite/gcc.target/sparc/prefetchi-1.c b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
> new file mode 100644
> index 00000000000..1bd7ad495e2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/sparc/prefetchi-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mcpu=v9" } */
> +
> +/* Remind users that instruction prefetch is not supported yet.  */
> +
> +void
> +bad (const int* p)
> +{
> +  __builtin_prefetch(p, 0, 3, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +  __builtin_prefetch(p, 0, 2, 0);      /* { dg-warning "instruction prefetch is not supported; using data prefetch" } */
> +}
> --
> 2.18.1
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI
  2022-10-14  8:34 [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI Haochen Jiang
  2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
  2022-10-14  8:34 ` [PATCH 2/2] Support Intel prefetchit0/t1 Haochen Jiang
@ 2022-10-19 20:49 ` Segher Boessenkool
  2 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-19 20:49 UTC (permalink / raw)
  To: Haochen Jiang
  Cc: gcc-patches, rguenther, hongtao.liu, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, linkw, uweigand, krebbel, olegendo, davem, ebotcazou,
	jeffreyalaw, dave.anglin

On Fri, Oct 14, 2022 at 04:34:04PM +0800, Haochen Jiang wrote:
> Sorry for the previous cover-letter stucking and disturbance and this
> is the right cover letter.

Hi!

Nothing in this cover letter tells me why you sent it to me?  Maybe you
shouldn't have at all, just one patch touches Power code, or such?


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
                     ` (2 preceding siblings ...)
  2022-10-19 17:14   ` [PATCH 1/2] " Andrew Pinski
@ 2022-10-19 21:06   ` Segher Boessenkool
  2022-10-20  1:39     ` Hongtao Liu
  2022-10-20  7:34     ` Jiang, Haochen
  3 siblings, 2 replies; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-19 21:06 UTC (permalink / raw)
  To: Haochen Jiang
  Cc: gcc-patches, rguenther, hongtao.liu, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, linkw, uweigand, krebbel, olegendo, davem, ebotcazou,
	jeffreyalaw, dave.anglin

On Fri, Oct 14, 2022 at 04:34:05PM +0800, Haochen Jiang wrote:
> 	* config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for

(Many too long lines here, this is the first one.  Changelog lines are
max. 80 positions; a tab is eight).

> +  /* Argument 3 must be either zero or one.  */
> +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> +    {
> +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> +	" using one");

"using 1" makes sense maybe, but "using one" reads as "using an
argument", not very sane.

An error would be better here anyway?

> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -14060,10 +14060,25 @@
>    DONE;
>  })
>  
> -(define_insn "prefetch"
> +(define_expand "prefetch"
> +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> +	     (match_operand:SI 1 "const_int_operand")
> +	     (match_operand:SI 2 "const_int_operand")
> +	     (match_operand:SI 3 "const_int_operand"))]
> +  ""
> +{
> +  if (INTVAL (operands[3]) == 0)
> +  {

Broken indentation.

> +    warning (0, "instruction prefetch is not supported; using data prefetch");

Please use a separate pattern for this, and leave prefetch to mean data
prefetch, as documented!  Documentation you didn't change btw.  Call the
new one instruction_prefetch or something equally boring maybe :-)

When you send an updated patch, please split it up better?  Generic
changes and documentation in one patch, target changes in a separate
patch or patches, and testsuite is distinct as well.  It isn't nice to
have to scroll through thousands of lines to see if there is anything
relevant to you.

Thanks,


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-19 17:14   ` [PATCH 1/2] " Andrew Pinski
@ 2022-10-19 21:14     ` Segher Boessenkool
  2022-10-20  1:27       ` Hongtao Liu
  2022-10-20  1:44       ` Jiang, Haochen
  0 siblings, 2 replies; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-19 21:14 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Haochen Jiang, gcc-patches, aoliva, richard.sandiford, uweigand,
	linkw, gnu, dje.gcc, olegendo, claziss, mfortune, davem,
	dave.anglin, hubicka, richard.earnshaw, rguenther,
	marcus.shawcroft, ramana.radhakrishnan, hongtao.liu

On Wed, Oct 19, 2022 at 10:14:28AM -0700, Andrew Pinski wrote:
> Do the testcases really need to be changed rather than adding new testcases?
> Usually it is better if the testcases not change unless really needed
> to be. That is do these testcases pass without being changed? If not
> this seems not backwards compatible change and is not something which
> we should do.  Otherwise you should just add new testcases instead.

Yes, that is another reason why adding parameters to random builtins is
not a good idea :-)  s/random/only vaguely related/, if you want.

This also makes all existing code using these builtins invalid.  If you
need such testcase changes, that is a red flag.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-19 21:14     ` Segher Boessenkool
@ 2022-10-20  1:27       ` Hongtao Liu
  2022-10-20  1:44       ` Jiang, Haochen
  1 sibling, 0 replies; 26+ messages in thread
From: Hongtao Liu @ 2022-10-20  1:27 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Andrew Pinski, mfortune, dave.anglin, rguenther, aoliva,
	richard.sandiford, uweigand, hubicka, marcus.shawcroft, olegendo,
	gcc-patches, linkw, richard.earnshaw, ramana.radhakrishnan,
	davem, gnu, hongtao.liu, claziss, dje.gcc

On Thu, Oct 20, 2022 at 5:15 AM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Wed, Oct 19, 2022 at 10:14:28AM -0700, Andrew Pinski wrote:
> > Do the testcases really need to be changed rather than adding new testcases?
> > Usually it is better if the testcases not change unless really needed
> > to be. That is do these testcases pass without being changed? If not
> > this seems not backwards compatible change and is not something which
> > we should do.  Otherwise you should just add new testcases instead.
>
> Yes, that is another reason why adding parameters to random builtins is
> not a good idea :-)  s/random/only vaguely related/, if you want.
>
> This also makes all existing code using these builtins invalid.  If you
> need such testcase changes, that is a red flag.
The default behavior is the same as before(default data prefetch,
we'll unchange those testcases.)
>
>
> Segher



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-19 21:06   ` Segher Boessenkool
@ 2022-10-20  1:39     ` Hongtao Liu
  2022-10-20  3:12       ` Hongtao Liu
  2022-10-20  7:34     ` Jiang, Haochen
  1 sibling, 1 reply; 26+ messages in thread
From: Hongtao Liu @ 2022-10-20  1:39 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Haochen Jiang, aoliva, richard.sandiford, uweigand, linkw, gnu,
	dje.gcc, olegendo, claziss, mfortune, gcc-patches, davem,
	dave.anglin, hubicka, richard.earnshaw, rguenther,
	marcus.shawcroft, ramana.radhakrishnan, hongtao.liu

On Thu, Oct 20, 2022 at 5:08 AM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Fri, Oct 14, 2022 at 04:34:05PM +0800, Haochen Jiang wrote:
> >       * config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for
>
> (Many too long lines here, this is the first one.  Changelog lines are
> max. 80 positions; a tab is eight).
>
> > +  /* Argument 3 must be either zero or one.  */
> > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> > +    {
> > +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> > +     " using one");
>
> "using 1" makes sense maybe, but "using one" reads as "using an
> argument", not very sane.
>
> An error would be better here anyway?
>
> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -14060,10 +14060,25 @@
> >    DONE;
> >  })
> >
> > -(define_insn "prefetch"
> > +(define_expand "prefetch"
> > +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> > +          (match_operand:SI 1 "const_int_operand")
> > +          (match_operand:SI 2 "const_int_operand")
> > +          (match_operand:SI 3 "const_int_operand"))]
> > +  ""
> > +{
> > +  if (INTVAL (operands[3]) == 0)
> > +  {
>
> Broken indentation.
>
> > +    warning (0, "instruction prefetch is not supported; using data prefetch");
>
> Please use a separate pattern for this, and leave prefetch to mean data
> prefetch, as documented!  Documentation you didn't change btw.  Call the
> new one instruction_prefetch or something equally boring maybe :-)
>
> When you send an updated patch, please split it up better?  Generic
> changes and documentation in one patch, target changes in a separate
We'll split testcase into a separate patch.
> patch or patches, and testsuite is distinct as well.  It isn't nice to
> have to scroll through thousands of lines to see if there is anything
> relevant to you.

Yes, it's an inconvenience for review, sorry for that. But since we've
changed rtl def for prefetch, moving the general part into a separate
commit may break bootstrap when rtl-check is enabled.
-DEF_RTL_EXPR(PREFETCH, "prefetch", "eee", RTX_EXTRA)
+DEF_RTL_EXPR(PREFETCH, "prefetch", "eeee", RTX_EXTRA)

And we want to make each commit pass the bootstrap and regression test.
>
> Thanks,
>
>
> Segher



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-19 21:14     ` Segher Boessenkool
  2022-10-20  1:27       ` Hongtao Liu
@ 2022-10-20  1:44       ` Jiang, Haochen
  2022-10-20 17:25         ` Segher Boessenkool
  1 sibling, 1 reply; 26+ messages in thread
From: Jiang, Haochen @ 2022-10-20  1:44 UTC (permalink / raw)
  To: Segher Boessenkool, Andrew Pinski
  Cc: gcc-patches, aoliva, richard.sandiford, uweigand, linkw, gnu,
	dje.gcc, olegendo, claziss, mfortune, davem, dave.anglin,
	hubicka, richard.earnshaw, rguenther, marcus.shawcroft,
	ramana.radhakrishnan, Liu, Hongtao

> -----Original Message-----
> From: Segher Boessenkool <segher@kernel.crashing.org>
> Sent: Thursday, October 20, 2022 5:14 AM
> To: Andrew Pinski <pinskia@gmail.com>
> Cc: Jiang, Haochen <haochen.jiang@intel.com>; gcc-patches@gcc.gnu.org;
> aoliva@gcc.gnu.org; richard.sandiford@arm.com; uweigand@de.ibm.com;
> linkw@gcc.gnu.org; gnu@amylaar.uk; dje.gcc@gmail.com;
> olegendo@gcc.gnu.org; claziss@synopsys.com; mfortune@gmail.com;
> davem@redhat.com; dave.anglin@bell.net; hubicka@ucw.cz;
> richard.earnshaw@arm.com; rguenther@suse.de;
> marcus.shawcroft@arm.com; ramana.radhakrishnan@arm.com; Liu, Hongtao
> <hongtao.liu@intel.com>
> Subject: Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch
> to align with LLVM
> 
> On Wed, Oct 19, 2022 at 10:14:28AM -0700, Andrew Pinski wrote:
> > Do the testcases really need to be changed rather than adding new
> testcases?
> > Usually it is better if the testcases not change unless really needed
> > to be. That is do these testcases pass without being changed? If not
> > this seems not backwards compatible change and is not something which
> > we should do.  Otherwise you should just add new testcases instead.
> 
> Yes, that is another reason why adding parameters to random builtins is not a
> good idea :-)  s/random/only vaguely related/, if you want.
> 
> This also makes all existing code using these builtins invalid.  If you need such
> testcase changes, that is a red flag.
> 

Maybe the testcase change cause some misunderstanding and concern.

Actually, the patch did not disrupt the previous builtins, as the builtin_prefetch
uses vargs. I set the default value of the new parameter as data prefetch, which
means that if we are not using the fourth parameter, just like how we use
prefetch previously, it is still what it is.

The reason why I did the most of the testcase change is to make it looks more
completed at the parameter side. I could take back that change on adding
parameter in current testcases just add tests related to new parameter, which
is a minimal change to current test I suppose.

BRs,
Haochen

> 
> Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20  1:39     ` Hongtao Liu
@ 2022-10-20  3:12       ` Hongtao Liu
  2022-10-20 18:44         ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Hongtao Liu @ 2022-10-20  3:12 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Haochen Jiang, aoliva, richard.sandiford, uweigand, linkw, gnu,
	dje.gcc, olegendo, claziss, mfortune, gcc-patches, davem,
	dave.anglin, hubicka, richard.earnshaw, rguenther,
	marcus.shawcroft, ramana.radhakrishnan, hongtao.liu

On Thu, Oct 20, 2022 at 9:39 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Thu, Oct 20, 2022 at 5:08 AM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> >
> > On Fri, Oct 14, 2022 at 04:34:05PM +0800, Haochen Jiang wrote:
> > >       * config/s390/s390.cc (s390_expand_cpymem): Generate fourth parameter for
> >
> > (Many too long lines here, this is the first one.  Changelog lines are
> > max. 80 positions; a tab is eight).
> >
> > > +  /* Argument 3 must be either zero or one.  */
> > > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> > > +    {
> > > +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> > > +     " using one");
> >
> > "using 1" makes sense maybe, but "using one" reads as "using an
> > argument", not very sane.
> >
> > An error would be better here anyway?
> >
> > > --- a/gcc/config/rs6000/rs6000.md
> > > +++ b/gcc/config/rs6000/rs6000.md
> > > @@ -14060,10 +14060,25 @@
> > >    DONE;
> > >  })
> > >
> > > -(define_insn "prefetch"
> > > +(define_expand "prefetch"
> > > +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> > > +          (match_operand:SI 1 "const_int_operand")
> > > +          (match_operand:SI 2 "const_int_operand")
> > > +          (match_operand:SI 3 "const_int_operand"))]
> > > +  ""
> > > +{
> > > +  if (INTVAL (operands[3]) == 0)
> > > +  {
> >
> > Broken indentation.
> >
> > > +    warning (0, "instruction prefetch is not supported; using data prefetch");
> >
> > Please use a separate pattern for this, and leave prefetch to mean data
> > prefetch, as documented!  Documentation you didn't change btw.  Call the
> > new one instruction_prefetch or something equally boring maybe :-)
Yes, Maybe we should add new rtl def named "iprefetch", so there will
be no need to change other backend.
> >
> > When you send an updated patch, please split it up better?  Generic
> > changes and documentation in one patch, target changes in a separate
> We'll split testcase into a separate patch.
> > patch or patches, and testsuite is distinct as well.  It isn't nice to
> > have to scroll through thousands of lines to see if there is anything
> > relevant to you.
>
> Yes, it's an inconvenience for review, sorry for that. But since we've
> changed rtl def for prefetch, moving the general part into a separate
> commit may break bootstrap when rtl-check is enabled.
> -DEF_RTL_EXPR(PREFETCH, "prefetch", "eee", RTX_EXTRA)
> +DEF_RTL_EXPR(PREFETCH, "prefetch", "eeee", RTX_EXTRA)
>
> And we want to make each commit pass the bootstrap and regression test.
> >
> > Thanks,
> >
> >
> > Segher
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Support Intel prefetchit0/t1
  2022-10-14  8:34 ` [PATCH 2/2] Support Intel prefetchit0/t1 Haochen Jiang
@ 2022-10-20  3:45   ` H.J. Lu
  2022-10-20  4:04     ` Hongtao Liu
  0 siblings, 1 reply; 26+ messages in thread
From: H.J. Lu @ 2022-10-20  3:45 UTC (permalink / raw)
  To: Haochen Jiang
  Cc: gcc-patches, aoliva, richard.sandiford, uweigand, linkw, gnu,
	dje.gcc, olegendo, claziss, segher, mfortune, davem, dave.anglin,
	hubicka, richard.earnshaw, rguenther, marcus.shawcroft,
	ramana.radhakrishnan, hongtao.liu

On Fri, Oct 14, 2022 at 1:38 AM Haochen Jiang via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> gcc/ChangeLog:
>
>         * common/config/i386/cpuinfo.h (get_available_features):
>         Detect PREFETCHI.
>         * common/config/i386/i386-common.cc
>         (OPTION_MASK_ISA2_PREFETCHI_SET,
>         OPTION_MASK_ISA2_PREFETCHI_UNSET): New.
>         (ix86_handle_option): Handle -mprefetchi.
>         * common/config/i386/i386-cpuinfo.h (enum processor_features):
>         Add FEATURE_PREFETCHI.
>         * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
>         prefetchi.
>         * config.gcc: Add prfchiintrin.h.
>         * config/i386/cpuid.h (bit_PREFETCHI): New.
>         * config/i386/i386-c.cc (ix86_target_macros_internal): Define
>         __PREFETCHI__.
>         * config/i386/i386-isa.def (PREFETCHI): Add DEF_PTA(PREFETCHI).
>         * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
>         Handle prefetchi.
>         * config/i386/i386.md (prefetch): Add handler for prefetchi
>         (*prefetch_i): New define_insn.
>         * config/i386/i386.opt: Add option -mprefetchi.
>         * config/i386/immintrin.h: Include prfchiintrin.h.
>         * config/i386/predicates.md (local_func_symbolic_operand):
>         New predicates.
>         * config/i386/xmmintrin.h (enum _mm_hint): New enum for prefetchi.
>         (_mm_prefetch): Handle the highest bit of enum.
>         * doc/extend.texi: Document prefetchi.
>         * doc/invoke.texi: Document -mprefetchi.
>         * doc/sourcebuild.texi: Document target prefetchi.
>         * config/i386/prfchiintrin.h: New file.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.dg/other/i386-2.C: Add -mprefetchi.
>         * g++.dg/other/i386-3.C: Ditto.
>         * gcc.misc-tests/i386-pf-3dnow-1.c: Add scan-assembler-not for
>         prefetchit0/t1.
>         * gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
>         * gcc.misc-tests/i386-pf-sse-1.c: Ditto.
>         * gcc.target/i386/avx-1.c: Add -mprefetchi.
>         * gcc.target/i386/avx-2.c: Ditto.
>         * gcc.target/i386/funcspec-56.inc: Add new target attribute.
>         * gcc.target/i386/prefetchi-1.c: Rewrite testcase.
>         * gcc.target/i386/prefetchi-2.c: New test.
>         * gcc.target/i386/prefetchi-3.c: Ditto.
>         * gcc.target/i386/sse-12.c: Add -mprefetchi.
>         * gcc.target/i386/sse-13.c: Ditto.
>         * gcc.target/i386/sse-14.c: Ditto.
>         * gcc.target/i386/sse-22.c: Add prefetchi.
>         * gcc.target/i386/sse-23.c: Ditto.
>
> Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
> ---
>  gcc/common/config/i386/cpuinfo.h              |  2 +
>  gcc/common/config/i386/i386-common.cc         | 15 ++++
>  gcc/common/config/i386/i386-cpuinfo.h         |  1 +
>  gcc/common/config/i386/i386-isas.h            |  1 +
>  gcc/config.gcc                                |  2 +-
>  gcc/config/i386/cpuid.h                       |  1 +
>  gcc/config/i386/i386-c.cc                     |  2 +
>  gcc/config/i386/i386-isa.def                  |  1 +
>  gcc/config/i386/i386-options.cc               |  4 +-
>  gcc/config/i386/i386.md                       | 90 +++++++++++++------
>  gcc/config/i386/i386.opt                      |  4 +
>  gcc/config/i386/immintrin.h                   |  2 +
>  gcc/config/i386/predicates.md                 | 15 ++++
>  gcc/config/i386/prfchiintrin.h                | 39 ++++++++
>  gcc/config/i386/xmmintrin.h                   |  6 +-
>  gcc/doc/extend.texi                           |  5 ++
>  gcc/doc/invoke.texi                           | 10 ++-
>  gcc/doc/sourcebuild.texi                      |  3 +
>  gcc/testsuite/g++.dg/other/i386-2.C           |  2 +-
>  gcc/testsuite/g++.dg/other/i386-3.C           |  2 +-
>  .../gcc.misc-tests/i386-pf-3dnow-1.c          |  2 +
>  .../gcc.misc-tests/i386-pf-athlon-1.c         |  2 +
>  gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  2 +
>  gcc/testsuite/gcc.target/i386/avx-1.c         |  2 +-
>  gcc/testsuite/gcc.target/i386/avx-2.c         |  2 +-
>  gcc/testsuite/gcc.target/i386/funcspec-56.inc |  2 +
>  gcc/testsuite/gcc.target/i386/prefetchi-1.c   | 36 ++++++--
>  gcc/testsuite/gcc.target/i386/prefetchi-2.c   | 26 ++++++
>  gcc/testsuite/gcc.target/i386/prefetchi-3.c   | 15 ++++
>  gcc/testsuite/gcc.target/i386/sse-12.c        |  2 +-
>  gcc/testsuite/gcc.target/i386/sse-13.c        |  2 +-
>  gcc/testsuite/gcc.target/i386/sse-14.c        |  2 +-
>  gcc/testsuite/gcc.target/i386/sse-22.c        |  4 +-
>  gcc/testsuite/gcc.target/i386/sse-23.c        |  2 +-
>  34 files changed, 259 insertions(+), 49 deletions(-)
>  create mode 100644 gcc/config/i386/prfchiintrin.h
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-3.c
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index 118f3a42abd..551e0483330 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -797,6 +797,8 @@ get_available_features (struct __processor_model *cpu_model,
>         set_feature (FEATURE_HRESET);
>        if (eax & bit_CMPCCXADD)
>         set_feature(FEATURE_CMPCCXADD);
> +      if (edx & bit_PREFETCHI)
> +       set_feature (FEATURE_PREFETCHI);
>        if (avx_usable)
>         {
>           if (eax & bit_AVXVNNI)
> diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
> index f3d00ce4bc9..77ff07a3797 100644
> --- a/gcc/common/config/i386/i386-common.cc
> +++ b/gcc/common/config/i386/i386-common.cc
> @@ -112,6 +112,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define OPTION_MASK_ISA2_AVXNECONVERT_SET OPTION_MASK_ISA2_AVXNECONVERT
>  #define OPTION_MASK_ISA2_CMPCCXADD_SET OPTION_MASK_ISA2_CMPCCXADD
>  #define OPTION_MASK_ISA2_AMX_FP16_SET OPTION_MASK_ISA2_AMX_FP16
> +#define OPTION_MASK_ISA2_PREFETCHI_SET OPTION_MASK_ISA2_PREFETCHI
>
>  /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
>     as -msse4.2.  */
> @@ -287,6 +288,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define OPTION_MASK_ISA2_AVXNECONVERT_UNSET OPTION_MASK_ISA2_AVXNECONVERT
>  #define OPTION_MASK_ISA2_CMPCCXADD_UNSET OPTION_MASK_ISA2_CMPCCXADD
>  #define OPTION_MASK_ISA2_AMX_FP16_UNSET OPTION_MASK_ISA2_AMX_FP16
> +#define OPTION_MASK_ISA2_PREFETCHI_UNSET OPTION_MASK_ISA2_PREFETCHI
>
>  /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
>     as -mno-sse4.1. */
> @@ -1211,6 +1213,19 @@ ix86_handle_option (struct gcc_options *opts,
>         }
>        return true;
>
> +    case OPT_mprefetchi:
> +      if (value)
> +       {
> +         opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_PREFETCHI_SET;
> +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_SET;
> +       }
> +      else
> +       {
> +         opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_PREFETCHI_UNSET;
> +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_UNSET;
> +       }
> +      return true;
> +
>      case OPT_mfma:
>        if (value)
>         {
> diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h
> index f9d5b7238ea..3fe69178841 100644
> --- a/gcc/common/config/i386/i386-cpuinfo.h
> +++ b/gcc/common/config/i386/i386-cpuinfo.h
> @@ -246,6 +246,7 @@ enum processor_features
>    FEATURE_AVXNECONVERT,
>    FEATURE_CMPCCXADD,
>    FEATURE_AMX_FP16,
> +  FEATURE_PREFETCHI,
>    CPU_FEATURE_MAX
>  };
>
> diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h
> index 7c4a71413b5..8648ea6903c 100644
> --- a/gcc/common/config/i386/i386-isas.h
> +++ b/gcc/common/config/i386/i386-isas.h
> @@ -182,4 +182,5 @@ ISA_NAMES_TABLE_START
>                         P_NONE, "-mavxneconvert")
>    ISA_NAMES_TABLE_ENTRY("cmpccxadd", FEATURE_CMPCCXADD, P_NONE, "-mcmpccxadd")
>    ISA_NAMES_TABLE_ENTRY("amx-fp16", FEATURE_AMX_FP16, P_NONE, "-mamx-fp16")
> +  ISA_NAMES_TABLE_ENTRY("prefetchi", FEATURE_PREFETCHI, P_NONE, "-mprefetchi")
>  ISA_NAMES_TABLE_END
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 8a8712d1466..ceea7726bfd 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -423,7 +423,7 @@ i[34567]86-*-* | x86_64-*-*)
>                        hresetintrin.h keylockerintrin.h avxvnniintrin.h
>                        mwaitintrin.h avx512fp16intrin.h avx512fp16vlintrin.h
>                        avxifmaintrin.h avxvnniint8intrin.h avxneconvertintrin.h
> -                      cmpccxaddintrin.h amxfp16intrin.h"
> +                      cmpccxaddintrin.h amxfp16intrin.h prfchiintrin.h"
>         ;;
>  ia64-*-*)
>         extra_headers=ia64intrin.h
> diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
> index 229c15c5950..92583261883 100644
> --- a/gcc/config/i386/cpuid.h
> +++ b/gcc/config/i386/cpuid.h
> @@ -54,6 +54,7 @@
>  #define bit_AVXVNNIINT8 (1 << 4)
>  #define bit_AVXNECONVERT (1 << 5)
>  #define bit_CMPXCHG8B  (1 << 8)
> +#define bit_PREFETCHI  (1 << 14)
>  #define bit_CMOV       (1 << 15)
>  #define bit_MMX                (1 << 23)
>  #define bit_FXSAVE     (1 << 24)
> diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
> index 3020b5f267a..74239002ed6 100644
> --- a/gcc/config/i386/i386-c.cc
> +++ b/gcc/config/i386/i386-c.cc
> @@ -650,6 +650,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
>      def_or_undef (parse_in, "__CMPCCXADD__");
>    if (isa_flag2 & OPTION_MASK_ISA2_AMX_FP16)
>      def_or_undef (parse_in, "__AMX_FP16__");
> +  if (isa_flag2 & OPTION_MASK_ISA2_PREFETCHI)
> +    def_or_undef (parse_in, "__PREFETCHI__");
>    if (TARGET_IAMCU)
>      {
>        def_or_undef (parse_in, "__iamcu");
> diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def
> index 55b25763957..f234dcc37d7 100644
> --- a/gcc/config/i386/i386-isa.def
> +++ b/gcc/config/i386/i386-isa.def
> @@ -114,3 +114,4 @@ DEF_PTA(AVXVNNIINT8)
>  DEF_PTA(AVXNECONVERT)
>  DEF_PTA(CMPCCXADD)
>  DEF_PTA(AMX_FP16)
> +DEF_PTA(PREFETCHI)
> diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
> index bf37c77589e..3f98b09e5cf 100644
> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -232,7 +232,8 @@ static struct ix86_target_opts isa2_opts[] =
>    { "-mavxvnniint8",   OPTION_MASK_ISA2_AVXVNNIINT8 },
>    { "-mavxneconvert",   OPTION_MASK_ISA2_AVXNECONVERT },
>    { "-mcmpccxadd",      OPTION_MASK_ISA2_CMPCCXADD },
> -  { "-mamx-fp16",       OPTION_MASK_ISA2_AMX_FP16 }
> +  { "-mamx-fp16",       OPTION_MASK_ISA2_AMX_FP16 },
> +  { "-mprefetchi",      OPTION_MASK_ISA2_PREFETCHI }
>  };
>  static struct ix86_target_opts isa_opts[] =
>  {
> @@ -1084,6 +1085,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
>      IX86_ATTR_ISA ("avxneconvert",   OPT_mavxneconvert),
>      IX86_ATTR_ISA ("cmpccxadd",   OPT_mcmpccxadd),
>      IX86_ATTR_ISA ("amx-fp16", OPT_mamx_fp16),
> +    IX86_ATTR_ISA ("prefetchi",   OPT_mprefetchi),
>
>      /* enum options */
>      IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_),
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index c65cf14b9f4..fb75f57483b 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -23637,47 +23637,65 @@
>              (match_operand:SI 1 "const_int_operand")
>              (match_operand:SI 2 "const_int_operand")
>              (match_operand:SI 3 "const_int_operand"))]
> -  "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
> +  "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1
> +  || TARGET_PREFETCHI"
>  {
> -  if (INTVAL (operands[3]) == 0)
> -  {
> -    warning (0, "instruction prefetch is not supported; using data prefetch");
> -    operands[3] = const1_rtx;
> -  }
>    bool write = operands[1] != const0_rtx;
>    int locality = INTVAL (operands[2]);
> +  bool data = operands[3] != const0_rtx;
>
>    gcc_assert (IN_RANGE (locality, 0, 3));
>
> -  /* Use 3dNOW prefetch in case we are asking for write prefetch not
> -     supported by SSE counterpart (non-SSE2 athlon machines) or the
> -     SSE prefetch is not available (K6 machines).  Otherwise use SSE
> -     prefetch as it allows specifying of locality.  */
> -
> -  if (write)
> +  if (data)
>      {
> -      if (TARGET_PREFETCHWT1)
> -       operands[2] = GEN_INT (MAX (locality, 2));
> -      else if (TARGET_PRFCHW)
> -       operands[2] = GEN_INT (3);
> -      else if (TARGET_3DNOW && !TARGET_SSE2)
> -       operands[2] = GEN_INT (3);
> -      else if (TARGET_PREFETCH_SSE)
> -       operands[1] = const0_rtx;
> +      /* Use 3dNOW prefetch in case we are asking for write prefetch not
> +        supported by SSE counterpart (non-SSE2 athlon machines) or the
> +        SSE prefetch is not available (K6 machines).  Otherwise use SSE
> +        prefetch as it allows specifying of locality.  */
> +
> +      if (write)
> +       {
> +         if (TARGET_PREFETCHWT1)
> +           operands[2] = GEN_INT (MAX (locality, 2));
> +         else if (TARGET_PRFCHW)
> +           operands[2] = GEN_INT (3);
> +         else if (TARGET_3DNOW && !TARGET_SSE2)
> +           operands[2] = GEN_INT (3);
> +         else if (TARGET_PREFETCH_SSE)
> +           operands[1] = const0_rtx;
> +         else
> +           {
> +             gcc_assert (TARGET_3DNOW);
> +             operands[2] = GEN_INT (3);
> +           }
> +       }
>        else
>         {
> -         gcc_assert (TARGET_3DNOW);
> -         operands[2] = GEN_INT (3);
> +         if (TARGET_PREFETCH_SSE)
> +           ;
> +         else
> +           {
> +             gcc_assert (TARGET_3DNOW);
> +             operands[2] = GEN_INT (3);
> +           }
>         }
>      }
>    else
>      {
> -      if (TARGET_PREFETCH_SSE)
> +      /* GOT/PLT_PIC should not be available for instruction prefetch.
> +        It must be real instruction address.  */
> +      if (TARGET_PREFETCHI && TARGET_64BIT
> +        && local_func_symbolic_operand (operands[0], GET_MODE (operands[0])))
>         ;
>        else
>         {
> -         gcc_assert (TARGET_3DNOW);
> -         operands[2] = GEN_INT (3);
> +         /* Ignore the hint.  */
> +         warning (0, "instruction prefetch applies when in 64-bit mode"
> +                     " with RIP-relative addressing and"
> +                     " option %<-mprefetchi%>;"
> +                     " they stay NOPs otherwise");
> +         emit_insn (gen_nop ());
> +         DONE;
>         }
>      }
>  })
> @@ -23733,6 +23751,28 @@
>         (symbol_ref "memory_address_length (operands[0], false)"))
>     (set_attr "memory" "none")])
>
> +(define_insn "*prefetch_i"
> +  [(prefetch (match_operand 0 "local_func_symbolic_operand" "p")
> +            (const_int 0)
> +            (match_operand:SI 1 "const_int_operand")
> +            (const_int 0))]
> +  "TARGET_PREFETCHI"
> +{
> +  static const char * const patterns[2] = {
> +    "prefetchit1\t%a0", "prefetchit0\t%a0"
> +  };
> +
> +  int locality = INTVAL (operands[1]);
> +  gcc_assert (IN_RANGE (locality, 2, 3));
> +
> +  return patterns[locality - 2];
> +}
> +  [(set_attr "type" "sse")
> +   (set_attr "atom_sse_attr" "prefetch")
> +   (set (attr "length_address")
> +       (symbol_ref "memory_address_length (operands[0], false)"))
> +   (set_attr "memory" "none")])
> +
>  (define_expand "stack_protect_set"
>    [(match_operand 0 "memory_operand")
>     (match_operand 1 "memory_operand")]
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index eaa43946341..1d91103cd54 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -1238,3 +1238,7 @@ CMPCCXADD build-in functions and code generation.
>  mamx-fp16
>  Target Mask(ISA2_AMX_FP16) Var(ix86_isa_flags2) Save
>  Support AMX-FP16 built-in functions and code generation.
> +
> +mprefetchi
> +Target Mask(ISA2_PREFETCHI) Var(ix86_isa_flags2) Save
> +Support PREFETCHI built-in functions and code generation.
> diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
> index d8415863f52..ac6402653e0 100644
> --- a/gcc/config/i386/immintrin.h
> +++ b/gcc/config/i386/immintrin.h
> @@ -134,6 +134,8 @@
>
>  #include <amxbf16intrin.h>
>
> +#include <prfchiintrin.h>
> +
>  #include <prfchwintrin.h>
>
>  #include <keylockerintrin.h>
> diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> index c4141a96735..2a3f07224cc 100644
> --- a/gcc/config/i386/predicates.md
> +++ b/gcc/config/i386/predicates.md
> @@ -610,6 +610,21 @@
>    return false;
>  })
>
> +(define_predicate "local_func_symbolic_operand"
> +  (match_operand 0 "local_symbolic_operand")
> +{
> +  if (GET_CODE (op) == CONST
> +      && GET_CODE (XEXP (op, 0)) == PLUS
> +      && CONST_INT_P (XEXP (XEXP (op, 0), 1)))
> +    op = XEXP (XEXP (op, 0), 0);
> +
> +  if (GET_CODE (op) == SYMBOL_REF
> +      && !SYMBOL_REF_FUNCTION_P (op))
> +    return false;
> +
> +  return true;
> +})

Will it return true for any memory address?  I think we should
support code label and check for SYMBOL_REF_LOCAL_P.

-- 
H.J.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/2] Support Intel prefetchit0/t1
  2022-10-20  3:45   ` H.J. Lu
@ 2022-10-20  4:04     ` Hongtao Liu
  0 siblings, 0 replies; 26+ messages in thread
From: Hongtao Liu @ 2022-10-20  4:04 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Haochen Jiang, mfortune, dave.anglin, rguenther, segher, aoliva,
	richard.sandiford, uweigand, marcus.shawcroft, olegendo,
	gcc-patches, linkw, richard.earnshaw, ramana.radhakrishnan,
	davem, gnu, hongtao.liu, claziss, hubicka, dje.gcc

On Thu, Oct 20, 2022 at 11:46 AM H.J. Lu via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Oct 14, 2022 at 1:38 AM Haochen Jiang via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > gcc/ChangeLog:
> >
> >         * common/config/i386/cpuinfo.h (get_available_features):
> >         Detect PREFETCHI.
> >         * common/config/i386/i386-common.cc
> >         (OPTION_MASK_ISA2_PREFETCHI_SET,
> >         OPTION_MASK_ISA2_PREFETCHI_UNSET): New.
> >         (ix86_handle_option): Handle -mprefetchi.
> >         * common/config/i386/i386-cpuinfo.h (enum processor_features):
> >         Add FEATURE_PREFETCHI.
> >         * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
> >         prefetchi.
> >         * config.gcc: Add prfchiintrin.h.
> >         * config/i386/cpuid.h (bit_PREFETCHI): New.
> >         * config/i386/i386-c.cc (ix86_target_macros_internal): Define
> >         __PREFETCHI__.
> >         * config/i386/i386-isa.def (PREFETCHI): Add DEF_PTA(PREFETCHI).
> >         * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p):
> >         Handle prefetchi.
> >         * config/i386/i386.md (prefetch): Add handler for prefetchi
> >         (*prefetch_i): New define_insn.
> >         * config/i386/i386.opt: Add option -mprefetchi.
> >         * config/i386/immintrin.h: Include prfchiintrin.h.
> >         * config/i386/predicates.md (local_func_symbolic_operand):
> >         New predicates.
> >         * config/i386/xmmintrin.h (enum _mm_hint): New enum for prefetchi.
> >         (_mm_prefetch): Handle the highest bit of enum.
> >         * doc/extend.texi: Document prefetchi.
> >         * doc/invoke.texi: Document -mprefetchi.
> >         * doc/sourcebuild.texi: Document target prefetchi.
> >         * config/i386/prfchiintrin.h: New file.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * g++.dg/other/i386-2.C: Add -mprefetchi.
> >         * g++.dg/other/i386-3.C: Ditto.
> >         * gcc.misc-tests/i386-pf-3dnow-1.c: Add scan-assembler-not for
> >         prefetchit0/t1.
> >         * gcc.misc-tests/i386-pf-athlon-1.c: Ditto.
> >         * gcc.misc-tests/i386-pf-sse-1.c: Ditto.
> >         * gcc.target/i386/avx-1.c: Add -mprefetchi.
> >         * gcc.target/i386/avx-2.c: Ditto.
> >         * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> >         * gcc.target/i386/prefetchi-1.c: Rewrite testcase.
> >         * gcc.target/i386/prefetchi-2.c: New test.
> >         * gcc.target/i386/prefetchi-3.c: Ditto.
> >         * gcc.target/i386/sse-12.c: Add -mprefetchi.
> >         * gcc.target/i386/sse-13.c: Ditto.
> >         * gcc.target/i386/sse-14.c: Ditto.
> >         * gcc.target/i386/sse-22.c: Add prefetchi.
> >         * gcc.target/i386/sse-23.c: Ditto.
> >
> > Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
> > ---
> >  gcc/common/config/i386/cpuinfo.h              |  2 +
> >  gcc/common/config/i386/i386-common.cc         | 15 ++++
> >  gcc/common/config/i386/i386-cpuinfo.h         |  1 +
> >  gcc/common/config/i386/i386-isas.h            |  1 +
> >  gcc/config.gcc                                |  2 +-
> >  gcc/config/i386/cpuid.h                       |  1 +
> >  gcc/config/i386/i386-c.cc                     |  2 +
> >  gcc/config/i386/i386-isa.def                  |  1 +
> >  gcc/config/i386/i386-options.cc               |  4 +-
> >  gcc/config/i386/i386.md                       | 90 +++++++++++++------
> >  gcc/config/i386/i386.opt                      |  4 +
> >  gcc/config/i386/immintrin.h                   |  2 +
> >  gcc/config/i386/predicates.md                 | 15 ++++
> >  gcc/config/i386/prfchiintrin.h                | 39 ++++++++
> >  gcc/config/i386/xmmintrin.h                   |  6 +-
> >  gcc/doc/extend.texi                           |  5 ++
> >  gcc/doc/invoke.texi                           | 10 ++-
> >  gcc/doc/sourcebuild.texi                      |  3 +
> >  gcc/testsuite/g++.dg/other/i386-2.C           |  2 +-
> >  gcc/testsuite/g++.dg/other/i386-3.C           |  2 +-
> >  .../gcc.misc-tests/i386-pf-3dnow-1.c          |  2 +
> >  .../gcc.misc-tests/i386-pf-athlon-1.c         |  2 +
> >  gcc/testsuite/gcc.misc-tests/i386-pf-sse-1.c  |  2 +
> >  gcc/testsuite/gcc.target/i386/avx-1.c         |  2 +-
> >  gcc/testsuite/gcc.target/i386/avx-2.c         |  2 +-
> >  gcc/testsuite/gcc.target/i386/funcspec-56.inc |  2 +
> >  gcc/testsuite/gcc.target/i386/prefetchi-1.c   | 36 ++++++--
> >  gcc/testsuite/gcc.target/i386/prefetchi-2.c   | 26 ++++++
> >  gcc/testsuite/gcc.target/i386/prefetchi-3.c   | 15 ++++
> >  gcc/testsuite/gcc.target/i386/sse-12.c        |  2 +-
> >  gcc/testsuite/gcc.target/i386/sse-13.c        |  2 +-
> >  gcc/testsuite/gcc.target/i386/sse-14.c        |  2 +-
> >  gcc/testsuite/gcc.target/i386/sse-22.c        |  4 +-
> >  gcc/testsuite/gcc.target/i386/sse-23.c        |  2 +-
> >  34 files changed, 259 insertions(+), 49 deletions(-)
> >  create mode 100644 gcc/config/i386/prfchiintrin.h
> >  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-3.c
> >
> > diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> > index 118f3a42abd..551e0483330 100644
> > --- a/gcc/common/config/i386/cpuinfo.h
> > +++ b/gcc/common/config/i386/cpuinfo.h
> > @@ -797,6 +797,8 @@ get_available_features (struct __processor_model *cpu_model,
> >         set_feature (FEATURE_HRESET);
> >        if (eax & bit_CMPCCXADD)
> >         set_feature(FEATURE_CMPCCXADD);
> > +      if (edx & bit_PREFETCHI)
> > +       set_feature (FEATURE_PREFETCHI);
> >        if (avx_usable)
> >         {
> >           if (eax & bit_AVXVNNI)
> > diff --git a/gcc/common/config/i386/i386-common.cc b/gcc/common/config/i386/i386-common.cc
> > index f3d00ce4bc9..77ff07a3797 100644
> > --- a/gcc/common/config/i386/i386-common.cc
> > +++ b/gcc/common/config/i386/i386-common.cc
> > @@ -112,6 +112,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #define OPTION_MASK_ISA2_AVXNECONVERT_SET OPTION_MASK_ISA2_AVXNECONVERT
> >  #define OPTION_MASK_ISA2_CMPCCXADD_SET OPTION_MASK_ISA2_CMPCCXADD
> >  #define OPTION_MASK_ISA2_AMX_FP16_SET OPTION_MASK_ISA2_AMX_FP16
> > +#define OPTION_MASK_ISA2_PREFETCHI_SET OPTION_MASK_ISA2_PREFETCHI
> >
> >  /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
> >     as -msse4.2.  */
> > @@ -287,6 +288,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #define OPTION_MASK_ISA2_AVXNECONVERT_UNSET OPTION_MASK_ISA2_AVXNECONVERT
> >  #define OPTION_MASK_ISA2_CMPCCXADD_UNSET OPTION_MASK_ISA2_CMPCCXADD
> >  #define OPTION_MASK_ISA2_AMX_FP16_UNSET OPTION_MASK_ISA2_AMX_FP16
> > +#define OPTION_MASK_ISA2_PREFETCHI_UNSET OPTION_MASK_ISA2_PREFETCHI
> >
> >  /* SSE4 includes both SSE4.1 and SSE4.2.  -mno-sse4 should the same
> >     as -mno-sse4.1. */
> > @@ -1211,6 +1213,19 @@ ix86_handle_option (struct gcc_options *opts,
> >         }
> >        return true;
> >
> > +    case OPT_mprefetchi:
> > +      if (value)
> > +       {
> > +         opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_PREFETCHI_SET;
> > +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_SET;
> > +       }
> > +      else
> > +       {
> > +         opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_PREFETCHI_UNSET;
> > +         opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_PREFETCHI_UNSET;
> > +       }
> > +      return true;
> > +
> >      case OPT_mfma:
> >        if (value)
> >         {
> > diff --git a/gcc/common/config/i386/i386-cpuinfo.h b/gcc/common/config/i386/i386-cpuinfo.h
> > index f9d5b7238ea..3fe69178841 100644
> > --- a/gcc/common/config/i386/i386-cpuinfo.h
> > +++ b/gcc/common/config/i386/i386-cpuinfo.h
> > @@ -246,6 +246,7 @@ enum processor_features
> >    FEATURE_AVXNECONVERT,
> >    FEATURE_CMPCCXADD,
> >    FEATURE_AMX_FP16,
> > +  FEATURE_PREFETCHI,
> >    CPU_FEATURE_MAX
> >  };
> >
> > diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h
> > index 7c4a71413b5..8648ea6903c 100644
> > --- a/gcc/common/config/i386/i386-isas.h
> > +++ b/gcc/common/config/i386/i386-isas.h
> > @@ -182,4 +182,5 @@ ISA_NAMES_TABLE_START
> >                         P_NONE, "-mavxneconvert")
> >    ISA_NAMES_TABLE_ENTRY("cmpccxadd", FEATURE_CMPCCXADD, P_NONE, "-mcmpccxadd")
> >    ISA_NAMES_TABLE_ENTRY("amx-fp16", FEATURE_AMX_FP16, P_NONE, "-mamx-fp16")
> > +  ISA_NAMES_TABLE_ENTRY("prefetchi", FEATURE_PREFETCHI, P_NONE, "-mprefetchi")
> >  ISA_NAMES_TABLE_END
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 8a8712d1466..ceea7726bfd 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -423,7 +423,7 @@ i[34567]86-*-* | x86_64-*-*)
> >                        hresetintrin.h keylockerintrin.h avxvnniintrin.h
> >                        mwaitintrin.h avx512fp16intrin.h avx512fp16vlintrin.h
> >                        avxifmaintrin.h avxvnniint8intrin.h avxneconvertintrin.h
> > -                      cmpccxaddintrin.h amxfp16intrin.h"
> > +                      cmpccxaddintrin.h amxfp16intrin.h prfchiintrin.h"
> >         ;;
> >  ia64-*-*)
> >         extra_headers=ia64intrin.h
> > diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h
> > index 229c15c5950..92583261883 100644
> > --- a/gcc/config/i386/cpuid.h
> > +++ b/gcc/config/i386/cpuid.h
> > @@ -54,6 +54,7 @@
> >  #define bit_AVXVNNIINT8 (1 << 4)
> >  #define bit_AVXNECONVERT (1 << 5)
> >  #define bit_CMPXCHG8B  (1 << 8)
> > +#define bit_PREFETCHI  (1 << 14)
> >  #define bit_CMOV       (1 << 15)
> >  #define bit_MMX                (1 << 23)
> >  #define bit_FXSAVE     (1 << 24)
> > diff --git a/gcc/config/i386/i386-c.cc b/gcc/config/i386/i386-c.cc
> > index 3020b5f267a..74239002ed6 100644
> > --- a/gcc/config/i386/i386-c.cc
> > +++ b/gcc/config/i386/i386-c.cc
> > @@ -650,6 +650,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
> >      def_or_undef (parse_in, "__CMPCCXADD__");
> >    if (isa_flag2 & OPTION_MASK_ISA2_AMX_FP16)
> >      def_or_undef (parse_in, "__AMX_FP16__");
> > +  if (isa_flag2 & OPTION_MASK_ISA2_PREFETCHI)
> > +    def_or_undef (parse_in, "__PREFETCHI__");
> >    if (TARGET_IAMCU)
> >      {
> >        def_or_undef (parse_in, "__iamcu");
> > diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def
> > index 55b25763957..f234dcc37d7 100644
> > --- a/gcc/config/i386/i386-isa.def
> > +++ b/gcc/config/i386/i386-isa.def
> > @@ -114,3 +114,4 @@ DEF_PTA(AVXVNNIINT8)
> >  DEF_PTA(AVXNECONVERT)
> >  DEF_PTA(CMPCCXADD)
> >  DEF_PTA(AMX_FP16)
> > +DEF_PTA(PREFETCHI)
> > diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
> > index bf37c77589e..3f98b09e5cf 100644
> > --- a/gcc/config/i386/i386-options.cc
> > +++ b/gcc/config/i386/i386-options.cc
> > @@ -232,7 +232,8 @@ static struct ix86_target_opts isa2_opts[] =
> >    { "-mavxvnniint8",   OPTION_MASK_ISA2_AVXVNNIINT8 },
> >    { "-mavxneconvert",   OPTION_MASK_ISA2_AVXNECONVERT },
> >    { "-mcmpccxadd",      OPTION_MASK_ISA2_CMPCCXADD },
> > -  { "-mamx-fp16",       OPTION_MASK_ISA2_AMX_FP16 }
> > +  { "-mamx-fp16",       OPTION_MASK_ISA2_AMX_FP16 },
> > +  { "-mprefetchi",      OPTION_MASK_ISA2_PREFETCHI }
> >  };
> >  static struct ix86_target_opts isa_opts[] =
> >  {
> > @@ -1084,6 +1085,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree args, char *p_strings[],
> >      IX86_ATTR_ISA ("avxneconvert",   OPT_mavxneconvert),
> >      IX86_ATTR_ISA ("cmpccxadd",   OPT_mcmpccxadd),
> >      IX86_ATTR_ISA ("amx-fp16", OPT_mamx_fp16),
> > +    IX86_ATTR_ISA ("prefetchi",   OPT_mprefetchi),
> >
> >      /* enum options */
> >      IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_),
> > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> > index c65cf14b9f4..fb75f57483b 100644
> > --- a/gcc/config/i386/i386.md
> > +++ b/gcc/config/i386/i386.md
> > @@ -23637,47 +23637,65 @@
> >              (match_operand:SI 1 "const_int_operand")
> >              (match_operand:SI 2 "const_int_operand")
> >              (match_operand:SI 3 "const_int_operand"))]
> > -  "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1"
> > +  "TARGET_3DNOW || TARGET_PREFETCH_SSE || TARGET_PRFCHW || TARGET_PREFETCHWT1
> > +  || TARGET_PREFETCHI"
> >  {
> > -  if (INTVAL (operands[3]) == 0)
> > -  {
> > -    warning (0, "instruction prefetch is not supported; using data prefetch");
> > -    operands[3] = const1_rtx;
> > -  }
> >    bool write = operands[1] != const0_rtx;
> >    int locality = INTVAL (operands[2]);
> > +  bool data = operands[3] != const0_rtx;
> >
> >    gcc_assert (IN_RANGE (locality, 0, 3));
> >
> > -  /* Use 3dNOW prefetch in case we are asking for write prefetch not
> > -     supported by SSE counterpart (non-SSE2 athlon machines) or the
> > -     SSE prefetch is not available (K6 machines).  Otherwise use SSE
> > -     prefetch as it allows specifying of locality.  */
> > -
> > -  if (write)
> > +  if (data)
> >      {
> > -      if (TARGET_PREFETCHWT1)
> > -       operands[2] = GEN_INT (MAX (locality, 2));
> > -      else if (TARGET_PRFCHW)
> > -       operands[2] = GEN_INT (3);
> > -      else if (TARGET_3DNOW && !TARGET_SSE2)
> > -       operands[2] = GEN_INT (3);
> > -      else if (TARGET_PREFETCH_SSE)
> > -       operands[1] = const0_rtx;
> > +      /* Use 3dNOW prefetch in case we are asking for write prefetch not
> > +        supported by SSE counterpart (non-SSE2 athlon machines) or the
> > +        SSE prefetch is not available (K6 machines).  Otherwise use SSE
> > +        prefetch as it allows specifying of locality.  */
> > +
> > +      if (write)
> > +       {
> > +         if (TARGET_PREFETCHWT1)
> > +           operands[2] = GEN_INT (MAX (locality, 2));
> > +         else if (TARGET_PRFCHW)
> > +           operands[2] = GEN_INT (3);
> > +         else if (TARGET_3DNOW && !TARGET_SSE2)
> > +           operands[2] = GEN_INT (3);
> > +         else if (TARGET_PREFETCH_SSE)
> > +           operands[1] = const0_rtx;
> > +         else
> > +           {
> > +             gcc_assert (TARGET_3DNOW);
> > +             operands[2] = GEN_INT (3);
> > +           }
> > +       }
> >        else
> >         {
> > -         gcc_assert (TARGET_3DNOW);
> > -         operands[2] = GEN_INT (3);
> > +         if (TARGET_PREFETCH_SSE)
> > +           ;
> > +         else
> > +           {
> > +             gcc_assert (TARGET_3DNOW);
> > +             operands[2] = GEN_INT (3);
> > +           }
> >         }
> >      }
> >    else
> >      {
> > -      if (TARGET_PREFETCH_SSE)
> > +      /* GOT/PLT_PIC should not be available for instruction prefetch.
> > +        It must be real instruction address.  */
> > +      if (TARGET_PREFETCHI && TARGET_64BIT
> > +        && local_func_symbolic_operand (operands[0], GET_MODE (operands[0])))
> >         ;
> >        else
> >         {
> > -         gcc_assert (TARGET_3DNOW);
> > -         operands[2] = GEN_INT (3);
> > +         /* Ignore the hint.  */
> > +         warning (0, "instruction prefetch applies when in 64-bit mode"
> > +                     " with RIP-relative addressing and"
> > +                     " option %<-mprefetchi%>;"
> > +                     " they stay NOPs otherwise");
> > +         emit_insn (gen_nop ());
> > +         DONE;
> >         }
> >      }
> >  })
> > @@ -23733,6 +23751,28 @@
> >         (symbol_ref "memory_address_length (operands[0], false)"))
> >     (set_attr "memory" "none")])
> >
> > +(define_insn "*prefetch_i"
> > +  [(prefetch (match_operand 0 "local_func_symbolic_operand" "p")
> > +            (const_int 0)
> > +            (match_operand:SI 1 "const_int_operand")
> > +            (const_int 0))]
> > +  "TARGET_PREFETCHI"
> > +{
> > +  static const char * const patterns[2] = {
> > +    "prefetchit1\t%a0", "prefetchit0\t%a0"
> > +  };
> > +
> > +  int locality = INTVAL (operands[1]);
> > +  gcc_assert (IN_RANGE (locality, 2, 3));
> > +
> > +  return patterns[locality - 2];
> > +}
> > +  [(set_attr "type" "sse")
> > +   (set_attr "atom_sse_attr" "prefetch")
> > +   (set (attr "length_address")
> > +       (symbol_ref "memory_address_length (operands[0], false)"))
> > +   (set_attr "memory" "none")])
> > +
> >  (define_expand "stack_protect_set"
> >    [(match_operand 0 "memory_operand")
> >     (match_operand 1 "memory_operand")]
> > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> > index eaa43946341..1d91103cd54 100644
> > --- a/gcc/config/i386/i386.opt
> > +++ b/gcc/config/i386/i386.opt
> > @@ -1238,3 +1238,7 @@ CMPCCXADD build-in functions and code generation.
> >  mamx-fp16
> >  Target Mask(ISA2_AMX_FP16) Var(ix86_isa_flags2) Save
> >  Support AMX-FP16 built-in functions and code generation.
> > +
> > +mprefetchi
> > +Target Mask(ISA2_PREFETCHI) Var(ix86_isa_flags2) Save
> > +Support PREFETCHI built-in functions and code generation.
> > diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
> > index d8415863f52..ac6402653e0 100644
> > --- a/gcc/config/i386/immintrin.h
> > +++ b/gcc/config/i386/immintrin.h
> > @@ -134,6 +134,8 @@
> >
> >  #include <amxbf16intrin.h>
> >
> > +#include <prfchiintrin.h>
> > +
> >  #include <prfchwintrin.h>
> >
> >  #include <keylockerintrin.h>
> > diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
> > index c4141a96735..2a3f07224cc 100644
> > --- a/gcc/config/i386/predicates.md
> > +++ b/gcc/config/i386/predicates.md
> > @@ -610,6 +610,21 @@
> >    return false;
> >  })
> >
> > +(define_predicate "local_func_symbolic_operand"
> > +  (match_operand 0 "local_symbolic_operand")
> > +{
> > +  if (GET_CODE (op) == CONST
> > +      && GET_CODE (XEXP (op, 0)) == PLUS
> > +      && CONST_INT_P (XEXP (XEXP (op, 0), 1)))
> > +    op = XEXP (XEXP (op, 0), 0);
> > +
> > +  if (GET_CODE (op) == SYMBOL_REF
> > +      && !SYMBOL_REF_FUNCTION_P (op))
> > +    return false;
> > +
> > +  return true;
> > +})
>
> Will it return true for any memory address?  I think we should
No, I think it should first match local_symbolic_operand which also
supports code label.
>
> support code label and check for SYMBOL_REF_LOCAL_P.

It generates

foo_label:
.LFB6679:
        .cfi_startproc
.L4:
        prefetchit0     .L4(%rip)
        ret
        .cfi_endproc

for

void
foo_label ()
{
d:
  _mm_prefetch (&&d, _MM_HINT_IT0);
}

and warning
In function ‘_mm_prefetch’,
    inlined from ‘foo_r’ at prefetchi-1.c:18:2:
./gcc/include/xmmintrin.h:56:3: warning: instruction prefetch applies
when in 64-bit mode with RIP-relative addressing and option
‘-mprefetchi’; they stay NOPs otherwise
   56 |   __builtin_prefetch (__P, (__I & 0x4) >> 2, __I & 0x3, ((__I
& 0x10) >> 4) ^ 0x1);

For

void
foo_r (int* p)
{
        _mm_prefetch (p, _MM_HINT_IT0);
}

> --
> H.J.



-- 
BR,
Hongtao

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-19 21:06   ` Segher Boessenkool
  2022-10-20  1:39     ` Hongtao Liu
@ 2022-10-20  7:34     ` Jiang, Haochen
  2022-10-20 18:54       ` Segher Boessenkool
  1 sibling, 1 reply; 26+ messages in thread
From: Jiang, Haochen @ 2022-10-20  7:34 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: gcc-patches, rguenther, Liu, Hongtao, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, linkw, uweigand, krebbel, olegendo, davem, ebotcazou,
	jeffreyalaw, dave.anglin

> -----Original Message-----
> From: Segher Boessenkool <segher@kernel.crashing.org>
> Sent: Thursday, October 20, 2022 5:07 AM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: gcc-patches@gcc.gnu.org; rguenther@suse.de; Liu, Hongtao
> <hongtao.liu@intel.com>; ubizjak@gmail.com; richard.earnshaw@arm.com;
> richard.sandiford@arm.com; marcus.shawcroft@arm.com;
> kyrylo.tkachov@arm.com; rth@gcc.gnu.org; gnu@amylaar.uk;
> claziss@synopsys.com; nickc@redhat.com; ramana.radhakrishnan@arm.com;
> aoliva@gcc.gnu.org; hubicka@ucw.cz; mfortune@gmail.com;
> dje.gcc@gmail.com; linkw@gcc.gnu.org; uweigand@de.ibm.com;
> krebbel@linux.ibm.com; olegendo@gcc.gnu.org; davem@redhat.com;
> ebotcazou@libertysurf.fr; jeffreyalaw@gmail.com; dave.anglin@bell.net
> Subject: Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch
> to align with LLVM
> 
> On Fri, Oct 14, 2022 at 04:34:05PM +0800, Haochen Jiang wrote:
> > 	* config/s390/s390.cc (s390_expand_cpymem): Generate fourth
> parameter for
> 
> (Many too long lines here, this is the first one.  Changelog lines are
> max. 80 positions; a tab is eight).

I will change that in next patch.

> 
> > +  /* Argument 3 must be either zero or one.  */
> > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> > +    {
> > +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> > +	" using one");
> 
> "using 1" makes sense maybe, but "using one" reads as "using an
> argument", not very sane.
> 
> An error would be better here anyway?

Will change to 1 to avoid confusion in that. The reason why this is a warning
is because previous ones related to constant arguments out of range in prefetch
are also using warning.

/* Argument 2 must be 0, 1, 2, or 3.  */
  if (INTVAL (op2) < 0 || INTVAL (op2) > 3)
    {
      warning (0, "invalid third argument to %<__builtin_prefetch%>; using zero");
      op2 = const0_rtx;
    }

Therefore I use warning to align with them.

> 
> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -14060,10 +14060,25 @@
> >    DONE;
> >  })
> >
> > -(define_insn "prefetch"
> > +(define_expand "prefetch"
> > +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> > +	     (match_operand:SI 1 "const_int_operand")
> > +	     (match_operand:SI 2 "const_int_operand")
> > +	     (match_operand:SI 3 "const_int_operand"))]
> > +  ""
> > +{
> > +  if (INTVAL (operands[3]) == 0)
> > +  {
> 
> Broken indentation.

I will fix that in updated patch.

> 
> > +    warning (0, "instruction prefetch is not supported; using data prefetch");
> 
> Please use a separate pattern for this, and leave prefetch to mean data
> prefetch, as documented!  Documentation you didn't change btw.  Call the
> new one instruction_prefetch or something equally boring maybe :-)
> 

Actually I changed documentation for prefetch but it is flooded in the patch
(Sorry for that).

In gcc/doc/rtl.texi

-@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality})
+@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality} @var{cache})
 
+Operand @var{cache} is 1 if the prefetch is prefetching data, 0 for prefetching
+instruction;
+targets that do not support instruction prefetch should treat all as data
+prefetch.
 
And for the implementation on the instruction prefetch, actually I have thought
of that way previously. But I chose the way how patch current goes for the
following reasons.

1. Previously we are using parameter to indicate r/w and locality in prefetch. I
suppose it is quite similar in this case. Since the pattern is already there, I prefer
reusing them.

2. It will be more natural for developers to extend their prefetch in future.

If anyone have points, welcome further discussion on that.

> When you send an updated patch, please split it up better?  Generic
> changes and documentation in one patch, target changes in a separate
> patch or patches, and testsuite is distinct as well.  It isn't nice to
> have to scroll through thousands of lines to see if there is anything
> relevant to you.

Really sorry for that. Hongtao has explained the reason for why we arrange
this patch and I will split the testcase to another patch.

Also if the change on testsuites on this patch change to minimal change,
the patch will be much smaller than current one.

BRs,
Haochen

> 
> Thanks,
> 
> 
> Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20  1:44       ` Jiang, Haochen
@ 2022-10-20 17:25         ` Segher Boessenkool
  2022-10-20 17:37           ` Andrew Pinski
  0 siblings, 1 reply; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-20 17:25 UTC (permalink / raw)
  To: Jiang, Haochen
  Cc: Andrew Pinski, gcc-patches, aoliva, richard.sandiford, uweigand,
	linkw, gnu, dje.gcc, olegendo, claziss, mfortune, davem,
	dave.anglin, hubicka, richard.earnshaw, rguenther,
	marcus.shawcroft, ramana.radhakrishnan, Liu, Hongtao

On Thu, Oct 20, 2022 at 01:44:15AM +0000, Jiang, Haochen wrote:
> Maybe the testcase change cause some misunderstanding and concern.
> 
> Actually, the patch did not disrupt the previous builtins, as the builtin_prefetch
> uses vargs. I set the default value of the new parameter as data prefetch, which
> means that if we are not using the fourth parameter, just like how we use
> prefetch previously, it is still what it is.

I still think it is a mistake to have one builtin do two very distinct
operations, only very superficially related.  Instruction fetch and data
demand loads are almosty entirely unrelated, and so is the prefetch
machinery for them, on all machines I am familiar with.  Which makes
sense anyway, since instruction prefetch and data prefetch have
completely different performance characteristics and considerations.
Maybe if you start with the mistake of having unified L1 caches it
seems natural, but thankfully most machines do not do that.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20 17:25         ` Segher Boessenkool
@ 2022-10-20 17:37           ` Andrew Pinski
  2022-10-21 10:17             ` Richard Earnshaw
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Pinski @ 2022-10-20 17:37 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Jiang, Haochen, gcc-patches, aoliva, richard.sandiford, uweigand,
	linkw, gnu, dje.gcc, olegendo, claziss, mfortune, davem,
	dave.anglin, hubicka, richard.earnshaw, rguenther,
	marcus.shawcroft, ramana.radhakrishnan, Liu, Hongtao

On Thu, Oct 20, 2022 at 10:28 AM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Thu, Oct 20, 2022 at 01:44:15AM +0000, Jiang, Haochen wrote:
> > Maybe the testcase change cause some misunderstanding and concern.
> >
> > Actually, the patch did not disrupt the previous builtins, as the builtin_prefetch
> > uses vargs. I set the default value of the new parameter as data prefetch, which
> > means that if we are not using the fourth parameter, just like how we use
> > prefetch previously, it is still what it is.
>
> I still think it is a mistake to have one builtin do two very distinct
> operations, only very superficially related.  Instruction fetch and data
> demand loads are almosty entirely unrelated, and so is the prefetch
> machinery for them, on all machines I am familiar with.

On aarch64 (armv8), it is actually the same instruction: PRFM. It
might be the only one which is that way though.
It even allows to specify the level for the instruction prefetch too
(which is actually useful for say OcteonTX2 which has an interesting
cache hierarchy).

Though I agree it is a mistake to have one builtin which handles both
data and instruction prefetch.

Thanks,
Andrew


> Which makes
> sense anyway, since instruction prefetch and data prefetch have
> completely different performance characteristics and considerations.
> Maybe if you start with the mistake of having unified L1 caches it
> seems natural, but thankfully most machines do not do that.
>
>
> Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20  3:12       ` Hongtao Liu
@ 2022-10-20 18:44         ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-20 18:44 UTC (permalink / raw)
  To: Hongtao Liu
  Cc: Haochen Jiang, aoliva, richard.sandiford, uweigand, linkw, gnu,
	dje.gcc, olegendo, claziss, mfortune, gcc-patches, davem,
	dave.anglin, hubicka, richard.earnshaw, rguenther,
	marcus.shawcroft, ramana.radhakrishnan, hongtao.liu

On Thu, Oct 20, 2022 at 11:12:01AM +0800, Hongtao Liu wrote:
> On Thu, Oct 20, 2022 at 9:39 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > On Thu, Oct 20, 2022 at 5:08 AM Segher Boessenkool
> > > Please use a separate pattern for this, and leave prefetch to mean data
> > > prefetch, as documented!  Documentation you didn't change btw.  Call the
> > > new one instruction_prefetch or something equally boring maybe :-)
> Yes, Maybe we should add new rtl def named "iprefetch", so there will
> be no need to change other backend.

Spelling it out ("instruction_prefetch") is nicer.  This is not used
so often to warrant a cryptic short name.  But yes, that is what I am
asking for.


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20  7:34     ` Jiang, Haochen
@ 2022-10-20 18:54       ` Segher Boessenkool
  2022-10-21  3:17         ` Jiang, Haochen
  2022-10-24 10:00         ` Richard Sandiford
  0 siblings, 2 replies; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-20 18:54 UTC (permalink / raw)
  To: Jiang, Haochen
  Cc: gcc-patches, rguenther, Liu, Hongtao, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, linkw, uweigand, krebbel, olegendo, davem, ebotcazou,
	jeffreyalaw, dave.anglin

On Thu, Oct 20, 2022 at 07:34:13AM +0000, Jiang, Haochen wrote:
> > > +  /* Argument 3 must be either zero or one.  */
> > > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> > > +    {
> > > +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> > > +	" using one");
> > 
> > "using 1" makes sense maybe, but "using one" reads as "using an
> > argument", not very sane.
> > 
> > An error would be better here anyway?
> 
> Will change to 1 to avoid confusion in that. The reason why this is a warning
> is because previous ones related to constant arguments out of range in prefetch
> are also using warning.

Please don't repeat historical mistakes.  You might not want to fix the
existing code (since that can in theory break existing user code), but
that is not a reason to punish users of a new feature as well ;-)

> > Please use a separate pattern for this, and leave prefetch to mean data
> > prefetch, as documented!  Documentation you didn't change btw.  Call the
> > new one instruction_prefetch or something equally boring maybe :-)
> 
> Actually I changed documentation for prefetch but it is flooded in the patch
> (Sorry for that).

Oh huh, I looked for it but didn't find it.  Another argument for making
better patch series ;-)

> 1. Previously we are using parameter to indicate r/w and locality in prefetch. I
> suppose it is quite similar in this case. Since the pattern is already there, I prefer
> reusing them.

You can use the data prefetch RTL code for all data loads just as well,
it is more closely related than this -- but most people would call that
insanity!


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20 18:54       ` Segher Boessenkool
@ 2022-10-21  3:17         ` Jiang, Haochen
  2022-10-24 10:00         ` Richard Sandiford
  1 sibling, 0 replies; 26+ messages in thread
From: Jiang, Haochen @ 2022-10-21  3:17 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: gcc-patches, rguenther, Liu, Hongtao, ubizjak, richard.earnshaw,
	richard.sandiford, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, linkw, uweigand, krebbel, olegendo, davem, ebotcazou,
	jeffreyalaw, dave.anglin



> -----Original Message-----
> From: Segher Boessenkool <segher@kernel.crashing.org>
> Sent: Friday, October 21, 2022 2:54 AM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: gcc-patches@gcc.gnu.org; rguenther@suse.de; Liu, Hongtao
> <hongtao.liu@intel.com>; ubizjak@gmail.com; richard.earnshaw@arm.com;
> richard.sandiford@arm.com; marcus.shawcroft@arm.com;
> kyrylo.tkachov@arm.com; rth@gcc.gnu.org; gnu@amylaar.uk;
> claziss@synopsys.com; nickc@redhat.com; ramana.radhakrishnan@arm.com;
> aoliva@gcc.gnu.org; hubicka@ucw.cz; mfortune@gmail.com;
> dje.gcc@gmail.com; linkw@gcc.gnu.org; uweigand@de.ibm.com;
> krebbel@linux.ibm.com; olegendo@gcc.gnu.org; davem@redhat.com;
> ebotcazou@libertysurf.fr; jeffreyalaw@gmail.com; dave.anglin@bell.net
> Subject: Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch
> to align with LLVM
> 
> On Thu, Oct 20, 2022 at 07:34:13AM +0000, Jiang, Haochen wrote:
> > > > +  /* Argument 3 must be either zero or one.  */
> > > > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> > > > +    {
> > > > +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> > > > +	" using one");
> > >
> > > "using 1" makes sense maybe, but "using one" reads as "using an
> > > argument", not very sane.
> > >
> > > An error would be better here anyway?
> >
> > Will change to 1 to avoid confusion in that. The reason why this is a warning
> > is because previous ones related to constant arguments out of range in
> prefetch
> > are also using warning.
> 
> Please don't repeat historical mistakes.  You might not want to fix the
> existing code (since that can in theory break existing user code), but
> that is not a reason to punish users of a new feature as well ;-)
> 
> > > Please use a separate pattern for this, and leave prefetch to mean data
> > > prefetch, as documented!  Documentation you didn't change btw.  Call
> the
> > > new one instruction_prefetch or something equally boring maybe :-)
> >
> > Actually I changed documentation for prefetch but it is flooded in the patch
> > (Sorry for that).
> 
> Oh huh, I looked for it but didn't find it.  Another argument for making
> better patch series ;-)
> 
> > 1. Previously we are using parameter to indicate r/w and locality in prefetch.
> I
> > suppose it is quite similar in this case. Since the pattern is already there, I
> prefer
> > reusing them.
> 
> You can use the data prefetch RTL code for all data loads just as well,
> it is more closely related than this -- but most people would call that
> insanity!

Maybe you got me here. I suppose I will write another patch for a new RTL to see
which implementation is better.

Thx,
Haochen

> 
> 
> Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20 17:37           ` Andrew Pinski
@ 2022-10-21 10:17             ` Richard Earnshaw
  2022-10-21 18:08               ` Segher Boessenkool
  0 siblings, 1 reply; 26+ messages in thread
From: Richard Earnshaw @ 2022-10-21 10:17 UTC (permalink / raw)
  To: Andrew Pinski, Segher Boessenkool
  Cc: mfortune, dave.anglin, rguenther, aoliva, richard.sandiford,
	uweigand, hubicka, marcus.shawcroft, olegendo, gcc-patches,
	linkw, richard.earnshaw, ramana.radhakrishnan, davem, gnu, Liu,
	Hongtao, claziss, dje.gcc



On 20/10/2022 18:37, Andrew Pinski via Gcc-patches wrote:
> On Thu, Oct 20, 2022 at 10:28 AM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>>
>> On Thu, Oct 20, 2022 at 01:44:15AM +0000, Jiang, Haochen wrote:
>>> Maybe the testcase change cause some misunderstanding and concern.
>>>
>>> Actually, the patch did not disrupt the previous builtins, as the builtin_prefetch
>>> uses vargs. I set the default value of the new parameter as data prefetch, which
>>> means that if we are not using the fourth parameter, just like how we use
>>> prefetch previously, it is still what it is.
>>
>> I still think it is a mistake to have one builtin do two very distinct
>> operations, only very superficially related.  Instruction fetch and data
>> demand loads are almosty entirely unrelated, and so is the prefetch
>> machinery for them, on all machines I am familiar with.
> 
> On aarch64 (armv8), it is actually the same instruction: PRFM. It
> might be the only one which is that way though.
> It even allows to specify the level for the instruction prefetch too
> (which is actually useful for say OcteonTX2 which has an interesting
> cache hierarchy).
> 

Just because the encodings are similar doesn't mean that the 
instructions are the same, although it's true that once you reach 
unification in the cache hierarchy the end behaviour /might/ be 
indistinguishable.

Really, Segher's point seems to be 'why overload the existing builtin 
for this'?  It's not like the new parameter is something that users 
would really need to pass in as a run-time choice; and that wouldn't 
work anyway because in the end we do need distinct instructions.

R.

> Though I agree it is a mistake to have one builtin which handles both
> data and instruction prefetch.
> 
> Thanks,
> Andrew
> 
> 
>> Which makes
>> sense anyway, since instruction prefetch and data prefetch have
>> completely different performance characteristics and considerations.
>> Maybe if you start with the mistake of having unified L1 caches it
>> seems natural, but thankfully most machines do not do that.
>>
>>
>> Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-21 10:17             ` Richard Earnshaw
@ 2022-10-21 18:08               ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-21 18:08 UTC (permalink / raw)
  To: Richard Earnshaw
  Cc: Andrew Pinski, mfortune, dave.anglin, rguenther, aoliva,
	richard.sandiford, uweigand, hubicka, marcus.shawcroft, olegendo,
	gcc-patches, linkw, richard.earnshaw, ramana.radhakrishnan,
	davem, gnu, Liu, Hongtao, claziss, dje.gcc

On Fri, Oct 21, 2022 at 11:17:39AM +0100, Richard Earnshaw wrote:
> On 20/10/2022 18:37, Andrew Pinski via Gcc-patches wrote:
> >On aarch64 (armv8), it is actually the same instruction: PRFM. It
> >might be the only one which is that way though.
> >It even allows to specify the level for the instruction prefetch too
> >(which is actually useful for say OcteonTX2 which has an interesting
> >cache hierarchy).
> 
> Just because the encodings are similar doesn't mean that the 
> instructions are the same, although it's true that once you reach 
> unification in the cache hierarchy the end behaviour /might/ be 
> indistinguishable.

"Might", yes: for good results the hardware has to use very different
heuristics.  And of course it interacts with the hardware prefetchers
anyway (which are very different for code and data, and work a lot
better than software prefetch almost always for that matter).

> Really, Segher's point seems to be 'why overload the existing builtin 
> for this'?  It's not like the new parameter is something that users 
> would really need to pass in as a run-time choice; and that wouldn't 
> work anyway because in the end we do need distinct instructions.

Right.  The builtin as well as the RTL expressions.  But having nasty
builtin definitions hurts our users, and nasty RTL only ourselves ;-)


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-20 18:54       ` Segher Boessenkool
  2022-10-21  3:17         ` Jiang, Haochen
@ 2022-10-24 10:00         ` Richard Sandiford
  2022-10-24 21:19           ` Segher Boessenkool
  1 sibling, 1 reply; 26+ messages in thread
From: Richard Sandiford @ 2022-10-24 10:00 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Jiang, Haochen, gcc-patches, rguenther, Liu, Hongtao, ubizjak,
	richard.earnshaw, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, linkw, uweigand, krebbel, olegendo, davem, ebotcazou,
	jeffreyalaw, dave.anglin

Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Thu, Oct 20, 2022 at 07:34:13AM +0000, Jiang, Haochen wrote:
>> > > +  /* Argument 3 must be either zero or one.  */
>> > > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
>> > > +    {
>> > > +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
>> > > +	" using one");
>> > 
>> > "using 1" makes sense maybe, but "using one" reads as "using an
>> > argument", not very sane.
>> > 
>> > An error would be better here anyway?
>> 
>> Will change to 1 to avoid confusion in that. The reason why this is a warning
>> is because previous ones related to constant arguments out of range in prefetch
>> are also using warning.
>
> Please don't repeat historical mistakes.  You might not want to fix the
> existing code (since that can in theory break existing user code), but
> that is not a reason to punish users of a new feature as well ;-)

I agree an error would be appropriate for something like
__builtin_clear_cache.  But __builtin_prefetch is a hint only.
Nothing should break if the compiler simply evaluates the arguments
and does nothing else.

Using a warning in that situation means that, if the ranges of
parameters are increased in future, older compilers won't needlessly
reject new code.

So personally I think we should stick with the current choice
of a default-on warning.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM
  2022-10-24 10:00         ` Richard Sandiford
@ 2022-10-24 21:19           ` Segher Boessenkool
  0 siblings, 0 replies; 26+ messages in thread
From: Segher Boessenkool @ 2022-10-24 21:19 UTC (permalink / raw)
  To: Jiang, Haochen, gcc-patches, rguenther, Liu, Hongtao, ubizjak,
	richard.earnshaw, marcus.shawcroft, kyrylo.tkachov, rth, gnu,
	claziss, nickc, ramana.radhakrishnan, aoliva, hubicka, mfortune,
	dje.gcc, linkw, uweigand, krebbel, olegendo, davem, ebotcazou,
	jeffreyalaw, dave.anglin, richard.sandiford

On Mon, Oct 24, 2022 at 11:00:26AM +0100, Richard Sandiford wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
> > On Thu, Oct 20, 2022 at 07:34:13AM +0000, Jiang, Haochen wrote:
> >> > > +  /* Argument 3 must be either zero or one.  */
> >> > > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> >> > > +    {
> >> > > +      warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> >> > > +	" using one");
> >> > 
> >> > "using 1" makes sense maybe, but "using one" reads as "using an
> >> > argument", not very sane.
> >> > 
> >> > An error would be better here anyway?
> >> 
> >> Will change to 1 to avoid confusion in that. The reason why this is a warning
> >> is because previous ones related to constant arguments out of range in prefetch
> >> are also using warning.
> >
> > Please don't repeat historical mistakes.  You might not want to fix the
> > existing code (since that can in theory break existing user code), but
> > that is not a reason to punish users of a new feature as well ;-)
> 
> I agree an error would be appropriate for something like
> __builtin_clear_cache.  But __builtin_prefetch is a hint only.
> Nothing should break if the compiler simply evaluates the arguments
> and does nothing else.
> 
> Using a warning in that situation means that, if the ranges of
> parameters are increased in future, older compilers won't needlessly
> reject new code.

It means that if we want "2" to have a new meaning in the future, we can
not do that, since it will use the semantics of "1" on older compilers
(and that might well not be compatible).

And for what?  Is it ever so convenient for people to write random
numbers here?


Segher

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2022-10-24 21:20 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-14  8:34 [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI Haochen Jiang
2022-10-14  8:34 ` [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM Haochen Jiang
2022-10-14  8:46   ` Hongtao Liu
2022-10-17 15:28   ` Richard Earnshaw
2022-10-18  9:20     ` [PATCH v2] " Haochen Jiang
2022-10-19 17:14   ` [PATCH 1/2] " Andrew Pinski
2022-10-19 21:14     ` Segher Boessenkool
2022-10-20  1:27       ` Hongtao Liu
2022-10-20  1:44       ` Jiang, Haochen
2022-10-20 17:25         ` Segher Boessenkool
2022-10-20 17:37           ` Andrew Pinski
2022-10-21 10:17             ` Richard Earnshaw
2022-10-21 18:08               ` Segher Boessenkool
2022-10-19 21:06   ` Segher Boessenkool
2022-10-20  1:39     ` Hongtao Liu
2022-10-20  3:12       ` Hongtao Liu
2022-10-20 18:44         ` Segher Boessenkool
2022-10-20  7:34     ` Jiang, Haochen
2022-10-20 18:54       ` Segher Boessenkool
2022-10-21  3:17         ` Jiang, Haochen
2022-10-24 10:00         ` Richard Sandiford
2022-10-24 21:19           ` Segher Boessenkool
2022-10-14  8:34 ` [PATCH 2/2] Support Intel prefetchit0/t1 Haochen Jiang
2022-10-20  3:45   ` H.J. Lu
2022-10-20  4:04     ` Hongtao Liu
2022-10-19 20:49 ` [PATCH 0/2] Add a Fourth parameter for prefetch and Support Intel PREFETCHI Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).