From: Michael Meissner <meissner@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work122)] Fix power10 fusion and -fstack-protector, PR target/105325
Date: Tue, 13 Jun 2023 17:21:33 +0000 (GMT)
Message-ID: <20230613172133.2B2CB3858D38@sourceware.org>

https://gcc.gnu.org/g:816ce136d5776e4d82ac3beca98989ce58e77f1a

commit 816ce136d5776e4d82ac3beca98989ce58e77f1a
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Tue Jun 13 13:21:15 2023 -0400

    Fix power10 fusion and -fstack-protector, PR target/105325
    
    This patch fixes an issue where, if you use the -fstack-protector and
    -mcpu=power10 options and you have a large stack frame, GCC will generate an
    LWA instruction with a large offset.
    
    The important thing in the bug is that -fstack-protector is used, but it could
    potentially happen with fused load-compare to any stack location when the stack
    frame is larger than 32K without -fstack-protector.
    
    What happens is the initial insn that is created is:
    
    (insn 6 5 7 2 (parallel [
                (set (reg:CC 119)
                     (compare:CC (mem/c:SI (plus:DI (reg/f:DI 110 sfp)
                                                    (const_int -4))
                                 (const_int 0 [0])))
                (clobber (scratch:DI))
            ])
         (nil))
    
    After the stack size is finalized, the frame pointer removed, and the post
    reload phase is run, the insn is now:
    
    (insn 6 5 7 2 (parallel [
                (set (reg:CC 100 0 [119])
                     (compare:CC (mem/c:SI (plus:DI (reg/f:DI 1 1)
                                                    (const_int 40044))
                                 (const_int 0 [0])))
                (clobber (reg:DI 9 9 [120]))
            ])
         (nil))
    
    When the split2 pass is run after reload has finished, the
    ds_form_mem_operand predicate that is used for lwa and ld no longer returns
    true.  Since the operand no longer satisfies its predicate, the insn is not
    recognized and won't be split.
    
    The solution involves:
    
        1)  Don't use ds_form_mem_operand for ld and lwa, always use
            non_update_memory_operand.
    
        2)  Delete ds_form_mem_operand since it is no longer used.
    
        3)  Use the "YZ" constraints for ld/lwa instead of "m".
    
        4)  Ensure that the insn will be recognized as having a prefixed operand
            (and hence the instruction length is 16 bytes instead of 8 bytes).
    
            4a) Set the prefixed and maybe_prefix attributes to know that
                fused_load_cmpi are also load insns;
    
            4b) In the case where we are just setting CC and not using the memory
                afterward, set the clobber to use a DI register, and put an
                explicit sign_extend operation in the split;
    
            4c) Set the sign_extend attribute to "yes".
    
            4d) 4a-4c are the things that prefixed_load_p in rs6000.cc checks to
                ensure that lwa is treated as a ds-form instruction and not as
                a d-form instruction (i.e. lwz).
    
        5)  Add a new test case for this case.
    
        6)  Adjust the insn counts in fusion-p10-ldcmpi.c.  Because we are no
            longer using ds_form_mem_operand, the ld and lwa instructions will fuse
            x-form (reg+reg) addresses in addition to ds-form (reg+offset or reg).
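    
    The address rules behind items 1 and 4d can be sketched in C as follows.
    This is not GCC code: addr_form, fits_non_prefixed and needs_prefix are
    hypothetical names used only for illustration.  The model covers only
    reg+offset addresses and assumes the usual Power ISA rule that a D-form
    load takes any signed 16-bit displacement, while a DS-form load also
    requires the low two bits of the displacement to be zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: the two non-prefixed displacement
   encodings that matter here, lwz (D-form) and ld/lwa (DS-form).  */
enum addr_form { FORM_D, FORM_DS };

/* Return true if OFFSET can be encoded without a prefix in FORM.
   Both forms need a signed 16-bit displacement; DS-form additionally
   needs the displacement to be a multiple of 4.  */
static bool fits_non_prefixed (int64_t offset, enum addr_form form)
{
  if (offset < -32768 || offset > 32767)
    return false;
  return form == FORM_D || offset % 4 == 0;
}

/* Return true if a reg+OFFSET load in FORM needs a prefixed instruction.  */
static bool needs_prefix (int64_t offset, enum addr_form form)
{
  return !fits_non_prefixed (offset, form);
}
```

    Under this model, needs_prefix (40044, FORM_DS) is true, which matches the
    post-reload insn quoted above.  An offset such as 6 needs a prefix only
    when the load is classified as DS-form, which is why lwa must not be
    treated as if it were lwz.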
    
    2023-06-12   Michael Meissner  <meissner@linux.ibm.com>
    
    gcc/
    
            * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix problems that
            allowed prefixed lwa to be generated.
            * config/rs6000/fusion.md: Regenerate.
            * config/rs6000/predicates.md (ds_form_mem_operand): Delete.
            * config/rs6000/rs6000.md (prefixed attribute): Add support for load
            plus compare immediate fused insns.
            (maybe_prefixed): Likewise.
    
    gcc/testsuite/
    
            * g++.target/powerpc/pr105325.C: New test.
            * gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c: Update insn
            counts.

Diff:
---
 gcc/config/rs6000/fusion.md                        | 23 ++++++++------
 gcc/config/rs6000/genfusion.pl                     | 37 +++++++++++++++++++---
 gcc/config/rs6000/predicates.md                    | 14 --------
 gcc/config/rs6000/rs6000.md                        |  4 +--
 gcc/testsuite/g++.target/powerpc/pr105325.C        | 26 +++++++++++++++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c         | 14 ++++----
 6 files changed, 81 insertions(+), 37 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d45fb138a70..9eefae22a1a 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -22,7 +22,7 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -43,7 +43,7 @@
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -64,7 +64,7 @@
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -85,7 +85,7 @@
 ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -106,21 +106,22 @@
 ;; load mode is SI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
-   (clobber (match_scratch:SI 0 "=r"))]
+   (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
   "lwa%X1 %0,%1\;cmpdi %2,%0,%3"
   "&& reload_completed
    && (cc_reg_not_cr0_operand (operands[2], CCmode)
        || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
                                       SImode, NON_PREFIXED_DS))"
-  [(set (match_dup 0) (match_dup 1))
+  [(set (match_dup 0) (sign_extend:DI (match_dup 1)))
    (set (match_dup 2)
         (compare:CC (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -148,7 +149,7 @@
 ;; load mode is SI result mode is SI compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -157,12 +158,13 @@
    && (cc_reg_not_cr0_operand (operands[2], CCmode)
        || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
                                       SImode, NON_PREFIXED_DS))"
-  [(set (match_dup 0) (match_dup 1))
+  [(set (match_dup 0) (sign_extend:DI (match_dup 1)))
    (set (match_dup 2)
         (compare:CC (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -190,7 +192,7 @@
 ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
   "(TARGET_P10_FUSION)"
@@ -205,6 +207,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 82e8f863b02..31ee54aea93 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -61,15 +61,23 @@ sub gen_ld_cmpi_p10_one
   my $mempred = "non_update_memory_operand";
   my $extend;
 
+  # We need to special case lwa.  The prefixed_load_p function in rs6000.cc
+  # (which determines if a load instruction is prefixed) uses the fact that the
+  # register mode is different from the memory mode, and that the sign_extend
+  # attribute is set to use DS-form rules for the address instead of D-form.
+  # If the register size is the same, prefixed_load_p assumes we are doing a
+  # lwz.
+  my $lwa_insn = ($lmode eq "SI" && $ccmode eq "CC");
+
   if ($ccmode eq "CC") {
     # ld and lwa are both DS-FORM.
     ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS";
-    ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand";
+#   ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand";
   } else {
     if ($lmode eq "DI") {
       # ld is DS-form, but lwz is not.
       $np = "NON_PREFIXED_DS";
-      $mempred = "ds_form_mem_operand";
+ #    $mempred = "ds_form_mem_operand";
     }
   }
 
@@ -81,7 +89,9 @@ sub gen_ld_cmpi_p10_one
 
   # For clobber, we need a SI/DI reg in case we
   # split because we have to sign/zero extend.
-  my $clobbermode = ($lmode =~ /^[QH]I$/) ? "GPR" : $lmode;
+  my $clobbermode = (($lmode =~ /^[QH]I$/)
+		     ? "GPR"
+		     : ($lwa_insn ? "DI" : $lmode));
   if ($result =~ /^EXT/ || $result eq "GPR" || $clobbermode eq "GPR") {
     # We always need extension if result > lmode.
     $extend = ($ccmode eq "CC") ? "sign" : "zero";
@@ -91,12 +101,15 @@ sub gen_ld_cmpi_p10_one
   }
 
   my $ldst = mode_to_ldst_char($lmode);
+
+  # DS-form addresses need YZ, and not m.
+  my $constraint = ($np eq "NON_PREFIXED_DS") ? "YZ" : "m";
   print <<HERE;
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend
 (define_insn_and_split "*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}"
   [(set (match_operand:${ccmode} 2 "cc_reg_operand" "=x")
-        (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "m")
+        (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "${constraint}")
 HERE
   print "   " if $ccmode eq "CCUNS";
 print <<HERE;
@@ -126,7 +139,12 @@ HERE
                                       ${lmode}mode, ${np}))"
 HERE
 
-  if ($extend eq "none") {
+  # prefixed_load_p needs to see the register mode being different than the
+  # memory insn in order to validate lwa as a DS-form instruction and not a
+  # D-form instruction.
+  if ($lwa_insn && $extend eq "none") {
+    print "  [(set (match_dup 0) (sign_extend:${clobbermode} (match_dup 1)))\n";
+  } elsif ($extend eq "none") {
     print "  [(set (match_dup 0) (match_dup 1))\n";
   } elsif ($result eq "clobber") {
     print "  [(set (match_dup 0) (${extend}_extend:${clobbermode} (match_dup 1)))\n";
@@ -140,6 +158,15 @@ HERE
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+HERE
+
+  if ($lwa_insn) {
+    # prefixed_load_p needs the sign_extend attribute to validate lwa as a
+    # DS-form instruction instead of D-form.
+    print "   (set_attr \"sign_extend\" \"yes\")\n";
+  }
+
+  print <<HERE
    (set_attr "length" "8")])
 
 HERE
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index a16ee30f0c0..6b564837c6e 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1125,20 +1125,6 @@
   return INTVAL (offset) % 4 == 0;
 })
 
-;; Return 1 if the operand is a memory operand that has a valid address for
-;; a DS-form instruction. I.e. the address has to be either just a register,
-;; or register + const where the two low order bits of const are zero.
-(define_predicate "ds_form_mem_operand"
-  (match_code "subreg,mem")
-{
-  if (!any_memory_operand (op, mode))
-    return false;
-
-  rtx addr = XEXP (op, 0);
-
-  return address_to_insn_form (addr, mode, NON_PREFIXED_DS) == INSN_FORM_DS;
-})
-
 ;; Return 1 if the operand, used inside a MEM, is a SYMBOL_REF.
 (define_predicate "symbol_ref_operand"
   (and (match_code "symbol_ref")
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b0db8ae508d..75c5e5fc93d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -287,7 +287,7 @@
 ;; Whether this insn has a prefixed form and a non-prefixed form.
 (define_attr "maybe_prefixed" "no,yes"
   (if_then_else (eq_attr "type" "load,fpload,vecload,store,fpstore,vecstore,
-  				 integer,add")
+  				 integer,add,fused_load_cmpi")
 		(const_string "yes")
 		(const_string "no")))
 
@@ -302,7 +302,7 @@
 	      (eq_attr "maybe_prefixed" "no"))
 	 (const_string "no")
 
-	 (eq_attr "type" "load,fpload,vecload")
+	 (eq_attr "type" "load,fpload,vecload,fused_load_cmpi")
 	 (if_then_else (match_test "prefixed_load_p (insn)")
 		       (const_string "yes")
 		       (const_string "no"))
diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C
new file mode 100644
index 00000000000..d0e66a0b897
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr105325.C
@@ -0,0 +1,26 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */
+
+/* Test that power10 fusion does not generate an LWA/CMPDI instruction pair
+   instead of PLWZ/CMPWI.  Ultimately the code was dying because the fusion
+   load + compare -1/0/1 patterns did not handle the possibility that the load
+   might be prefixed.  The -fstack-protector option is needed to show the
+   bug.  */
+
+struct Ath__array1D {
+  int _current;
+  int getCnt() { return _current; }
+};
+struct extMeasure {
+  int _mapTable[10000];
+  Ath__array1D _metRCTable;
+};
+void measureRC() {
+  extMeasure m;
+  for (; m._metRCTable.getCnt();)
+    for (;;)
+      ;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
index 526a026d874..3efbb34f2b4 100644
--- a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
+++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
@@ -54,14 +54,14 @@ TEST(uint8_t)
 TEST(int8_t)
 
 /* { dg-final { scan-assembler-times "lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"   4 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_DI_CC_none"             4 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_clobber_CC_none"        4 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_DI_CCUNS_none"         1 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none"    1 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_DI_CC_none"            24 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_clobber_CC_none"        8 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_DI_CCUNS_none"         2 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none"    2 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"      16 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   4 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       4 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       8 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   2 { target lp64 } } } */
 
@@ -73,6 +73,8 @@ TEST(int8_t)
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"       8 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       9 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_SI_CC_none"           36 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"      16 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   6 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_SI_CCUNS_none"        0 { target ilp32 } } } */
