From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-200006-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 6322 invoked by alias); 23 Aug 2019 17:17:07 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 6309 invoked by uid 89); 23 Aug 2019 17:17:07 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-19.2 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,SPF_PASS autolearn=ham version=3.3.1 spammy=plug, alternatives, optabs, VOID_TYPE
X-HELO: mx1.suse.de
Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 23 Aug 2019 17:17:05 +0000
X-Amavis-Alert: BAD HEADER SECTION, Duplicate header field: "Cc"
Received: from relay2.suse.de (unknown [195.135.220.254])	by mx1.suse.de (Postfix) with ESMTP id 6B512AD22;	Fri, 23 Aug 2019 17:17:01 +0000 (UTC)
From: Martin Jambor <mjambor@suse.cz>
To: Segher Boessenkool <segher@kernel.crashing.org>, Tejas Joshi <tejasjoshi9673@gmail.com>
Cc: gcc@gcc.gnu.org, hubicka@ucw.cz, joseph@codesourcery.com, Richard Sandiford <richard.sandiford@arm.com>
Cc:
Subject: Re: Expansion of narrowing math built-ins into power instructions
In-Reply-To: <20190822095620.GK31406@gate.crashing.org>
References: <mptlfvo3sdh.fsf@arm.com> <20190820134613.GR31406@gate.crashing.org> <mpt8srn523k.fsf@arm.com> <20190820194154.GY31406@gate.crashing.org> <CACMrGjByVtJNmckrmMkntZO0ik+5xyfWgp=D8TpwU0PYd9fNSQ@mail.gmail.com> <20190821182852.GG31406@gate.crashing.org> <20190821191722.GH31406@gate.crashing.org> <CACMrGjCa0dLFpF_WNA0y3cHfs+Yu4fBBhy3F4JuTCKr9aUkW1Q@mail.gmail.com> <20190822062503.GI31406@gate.crashing.org> <CACMrGjD9oR6jJrN+XJZYsEW+Oheo4uV2JRxm35KX326A07ayrQ@mail.gmail.com> <20190822095620.GK31406@gate.crashing.org>
User-Agent: Notmuch/0.29.1 (https://notmuchmail.org) Emacs/26.2 (x86_64-suse-linux-gnu)
Date: Fri, 23 Aug 2019 17:17:00 -0000
Message-ID: <ri6h867dcok.fsf@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain
X-IsSubscribed: yes
X-SW-Source: 2019-08/txt/msg00200.txt.bz2

Hello,

On Thu, Aug 22 2019, Segher Boessenkool wrote:
>> > Hi Tejas,
>> >
>> > [ Please do not top-post. ]
>
> On Thu, Aug 22, 2019 at 01:27:06PM +0530, Tejas Joshi wrote:
>> > What happens then?  "It does not work" is very very vague.  At least it
>> > seems the compiler does build now?
>> 
>> Oh, compiler builds but instruction is still "bl fadd". It should be
>> "fadds" right?
>
> Yes, but that means the problem is earlier, before it hits RTL perhaps.
>
> Compile with -dap, look at the expand dump (the lowest numbered one, 234
> or so), and see what it looked like in the final Gimple, and then in the
> RTL generated from that.  And then drill down.
>

Tejas sent me his patch and I looked at why it did not work.  I found
two reasons:

1. associated_internal_fn (in builtins.c) does not handle
   DEF_INTERNAL_OPTAB_FN kind of internal functions, and Tejas
   (sensibly, I'd say) used that macro to define the internal function.
   But when I worked around that by manually adding a case for it in the
   switch statement, I ran into an assert because...

2. direct_internal_fn_supported_p on which replacement_internal_fn
   depends to expand built-ins as internal functions cannot handle
   conversion optabs... and narrowing is a kind of conversion and the
   optab is added as such with OPTAB_CD.

Actually, the second statement is not entirely true because somehow it
can handle optab while_ult which is a conversion optab but a) the way it
is handled, if I can understand it at all, seems to be a big hack and
would be even worse if we decided to copy that for all narrowing math
functions and b) it gets both modes from argument types whereas we need
one from the result type and so we would have to rewrite
replacement_internal_fn anyway.

Therefore, at least for now (GSoC deadline is kind of looming), I
decided that the best way forward would be to not rely on internal
functions but plug into expand_builtin() and I wrote the following,
lightly tested patch - which of course misses testcases and stuff - but
I'd be curious about any feedback now anyway.  When I proposed a very
similar approach for the roundeven x86_64 expansion, Uros actually then
opted for a solution based on internal functions, so I am curious
whether there are simple alternatives I do not see.

Tejas, of course cases for other fadd variants should at least be added
to expand_builtin.

Thanks,

Martin


2019-08-23  Tejas Joshi  <tejasjoshi9673@gmail.com>
	    Martin Jambor  <mjambor@suse.cz>

	* builtins.c (expand_builtin_binary_conversion): New function.
	  (expand_builtin): Call it.
	* config/rs6000/rs6000.md (unspec): Add UNSPEC_ADD_NARROWING.
	(add_truncdfsf3): New define_insn.
	* optabs.def (fadd_optab): New.


---
 gcc/builtins.c              | 55 +++++++++++++++++++++++++++++++++++++
 gcc/config/rs6000/rs6000.md | 13 +++++++++
 gcc/internal-fn.def         |  2 ++
 gcc/optabs.def              |  1 +
 4 files changed, 71 insertions(+)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9a766e4ad63..a9bf5710834 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2935,6 +2935,54 @@ expand_builtin_powi (tree exp, rtx target)
   return target;
 }
 
+/* Attempt to expand a builtin function call EXP which performs a binary
+   operation on its floating point arguments and then converts the result into
+   a different floating point format.  The operation in question is specified
+   in OP_OPTAB.  Return NULL if the attempt failed.  SUBTARGET may be used as
+   the target for computing the operand of EXP.  */
+
+static rtx
+expand_builtin_binary_conversion (tree exp, rtx target, rtx subtarget,
+				  optab op_optab)
+{
+  if (TREE_CODE (TREE_TYPE (exp)) != REAL_TYPE
+      || !validate_arglist (exp, REAL_TYPE, REAL_TYPE, VOID_TYPE))
+    return NULL_RTX;
+
+  tree arg0 = CALL_EXPR_ARG (exp, 0);
+  tree arg1 = CALL_EXPR_ARG (exp, 1);
+  gcc_assert (TYPE_MAIN_VARIANT (TREE_TYPE (arg0))
+	      == TYPE_MAIN_VARIANT (TREE_TYPE (arg1)));
+  machine_mode arg_mode = TYPE_MODE (TREE_TYPE (arg1));
+  machine_mode res_mode = TYPE_MODE (TREE_TYPE (exp));
+
+  insn_code icode = convert_optab_handler (op_optab, res_mode, arg_mode);
+  if (icode == CODE_FOR_nothing)
+    return NULL_RTX;
+
+  /* Wrap the computation of the arguments in a SAVE_EXPR, as we may
+     need to expand the argument again.  This way, we will not perform
+     side-effects more the once.  */
+  CALL_EXPR_ARG (exp, 0) = arg0 = builtin_save_expr (arg0);
+  CALL_EXPR_ARG (exp, 1) = arg1 = builtin_save_expr (arg1);
+
+  rtx op0 = expand_expr (arg0, subtarget, VOIDmode, EXPAND_NORMAL);
+  rtx op1 = expand_expr (arg1, subtarget, VOIDmode, EXPAND_NORMAL);
+
+  struct expand_operand ops[3];
+  create_output_operand (&ops[0], target, res_mode);
+  create_input_operand (&ops[1], op0, arg_mode);
+  create_input_operand (&ops[2], op1, arg_mode);
+  rtx_insn *pat = maybe_gen_insn (icode, 3, ops);
+  if (pat)
+    {
+      emit_insn (pat);
+      return ops[0].value;
+    }
+
+  return NULL_RTX;
+}
+
 /* Expand expression EXP which is a call to the strlen builtin.  Return
    NULL_RTX if we failed and the caller should emit a normal call, otherwise
    try to get the result in TARGET, if convenient.  */
@@ -7392,6 +7440,13 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
 	return target;
       break;
 
+    case BUILT_IN_FADD:
+      target = expand_builtin_binary_conversion (exp, target, subtarget,
+						 fadd_optab);
+      if (target)
+	return target;
+      break;
+
     case BUILT_IN_APPLY_ARGS:
       return expand_builtin_apply_args ();
 
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 9a7a1da987f..b44783a5028 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -89,6 +89,7 @@
    UNSPEC_TLSGOTTPREL
    UNSPEC_TLSTLS
    UNSPEC_FIX_TRUNC_TF		; fadd, rounding towards zero
+   UNSPEC_ADD_NARROWING		; fadd, narrow down to return type
    UNSPEC_STFIWX
    UNSPEC_POPCNTB
    UNSPEC_FRES
@@ -4653,6 +4654,18 @@
   [(set_attr "type" "fp")
    (set_attr "isa" "*,<Fisa>")])
 
+(define_insn "add_truncdfsf3"
+  [(set (match_operand:SF 0 "gpc_reg_operand" "=f,wa")
+	(unspec:SF [(match_operand:DF 1 "gpc_reg_operand" "%d,wa")
+		    (match_operand:DF 2 "gpc_reg_operand" "d,wa")]
+		     UNSPEC_ADD_NARROWING))]
+  "TARGET_HARD_FLOAT"
+  "@
+   fadds %0,%1,%2
+   xsaddsp %x0,%x1,%x2"
+  [(set_attr "type" "fp")
+   (set_attr "isa" "*,p8v")])
+
 (define_expand "sub<mode>3"
   [(set (match_operand:SFDF 0 "gpc_reg_operand")
 	(minus:SFDF (match_operand:SFDF 1 "gpc_reg_operand")
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 9461693bcd1..3f56880c23f 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -140,6 +140,8 @@ DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
 DEF_INTERNAL_OPTAB_FN (VEC_SHL_INSERT, ECF_CONST | ECF_NOTHROW,
 		       vec_shl_insert, binary)
 
+DEF_INTERNAL_OPTAB_FN (FADD, ECF_CONST, fadd, binary)
+
 DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 5283e6753f2..209369e9da1 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -67,6 +67,7 @@ OPTAB_CD(sfixtrunc_optab, "fix_trunc$F$b$I$a2")
 OPTAB_CD(ufixtrunc_optab, "fixuns_trunc$F$b$I$a2")
 
 /* Misc optabs that use two modes; model them as "conversions".  */
+OPTAB_CD(fadd_optab, "add_trunc$b$a3")
 OPTAB_CD(smul_widen_optab, "mul$b$a3")
 OPTAB_CD(umul_widen_optab, "umul$b$a3")
 OPTAB_CD(usmul_widen_optab, "usmul$b$a3")
-- 
2.22.0