public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: gcc-patches@gcc.gnu.org
Cc: law@redhat.com
Subject: Re: [PATCH] Fix PR84101, account for function ABI details in vectorization costs
Date: Wed, 31 Jan 2018 14:14:00 -0000	[thread overview]
Message-ID: <alpine.LSU.2.20.1801311419500.18265@zhemvz.fhfr.qr> (raw)
In-Reply-To: <alpine.LSU.2.20.1801301530590.18265@zhemvz.fhfr.qr>

On Tue, 30 Jan 2018, Richard Biener wrote:

> On Tue, 30 Jan 2018, Richard Biener wrote:
> 
> > 
> > This patch tries to deal with the "easy" part of a function ABI,
> > the return value location, in vectorization costing.  The testcase
> > shows that if we vectorize the returned value but the function
> > doesn't return in memory or in a vector register but as in this
> > case in an integer register pair (reg:TI ax) (bah, ABI details
> > exposed late?  why's this not a parallel?) we end up spilling
> > badly.
> > 
> > The idea is to account for such spilling so if vectorization
> > benefits outweight the spilling we'll vectorize anyway.
> > 
> > I think the particular testcase could be fixed in the subreg
> > pass basically undoing the vectorization but I realize that
> > generally this is a too hard problem and avoiding vectorization
> > is better.  Still this patch is somewhat fragile in that it
> > depends on us "seeing" that the stored to decl is returned
> > (see cfun_returns).
> > 
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> > 
> > I'd like to hear opinions on my use of hard_function_value
> > and also from other target maintainers.  I'm not sure we
> > have sufficient testsuite coverage of _profitable_ vectorization
> > of a return value.  Feel free to add to this for your
> > target.
> > 
> > Ok for trunk?
> 
> Ok, looks like hard_function_value doesn't like being called
> with
> 
>     type <record_type 0x7ffff65009d8 optional asm_written 
> needs-constructing type_1 type_5 type_6 BLK
>         size <integer_cst 0x7ffff64ee558 constant 8008>
> 
> we then ICE:
> 
> #0  fancy_abort (
>     file=0x2094ef0 
> "/space/rguenther/src/svn/early-lto-debug/gcc/machmode.h", 
>     line=292, 
>     function=0x2095534 <opt_mode<scalar_int_mode>::require() 
> const::__FUNCTION__> "require") at 
> /space/rguenther/src/svn/early-lto-debug/gcc/diagnostic.c:1500
> #1  0x0000000000c34790 in opt_mode<scalar_int_mode>::require (
>     this=0x7fffffffcf20)
>     at /space/rguenther/src/svn/early-lto-debug/gcc/machmode.h:292
> #2  0x0000000000db271a in hard_function_value (valtype=0x7ffff65009d8, 
>     func=0x7ffff64fff00, fntype=0x0, outgoing=1)
>     at /space/rguenther/src/svn/early-lto-debug/gcc/explow.c:2212
> 
> 2201          /* int_size_in_bytes can return -1.  We don't need a check 
> here
> 2202             since the value of bytes will then be large enough that 
> no
> 2203             mode will match anyway.  */
> 2204
> 2205          FOR_EACH_MODE_IN_CLASS (tmpmode, MODE_INT)
> 2206            {
> (gdb) l
> 2207              /* Have we found a large enough mode?  */
> 2208              if (GET_MODE_SIZE (tmpmode.require ()) >= bytes)
> 2209                break;
> 2210            }
> 2211
> 2212          PUT_MODE (val, tmpmode.require ());
> 
> which means we didn't find an integer mode of the wanted size.
> The backend simply returned (reg:BLK 0 ax).  I suppose I have
> to guard hard_function_value with sth ...

!aggregate_value_p seems to do the trick.

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?

Thanks,
Richard.

2018-01-31  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/84101
	* tree-vect-stmts.c: Include explow.h for hard_function_value.
	(cfun_returns): New helper.
	(vect_model_store_cost): When vectorizing a store to a decl
	we return and the function ABI returns via a non-vector
	register account for the possible spilling that will happen.

	* gcc.target/i386/pr84101.c: New testcase.

Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	(revision 257181)
+++ gcc/tree-vect-stmts.c	(working copy)
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
 #include "tree-cfg.h"
 #include "tree-ssa-loop-manip.h"
 #include "cfgloop.h"
+#include "explow.h"
 #include "tree-ssa-loop.h"
 #include "tree-scalar-evolution.h"
 #include "tree-vectorizer.h"
@@ -893,6 +894,22 @@ vect_model_promotion_demotion_cost (stmt
                      "prologue_cost = %d .\n", inside_cost, prologue_cost);
 }
 
+/* Returns true if the current function returns DECL.  */
+
+static bool
+cfun_returns (tree decl)
+{
+  edge_iterator ei;
+  edge e;
+  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
+    {
+      greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src));
+      if (ret && gimple_return_retval (ret) == decl)
+	return true;
+    }
+  return false;
+}
+
 /* Function vect_model_store_cost
 
    Models cost for stores.  In the case of grouped accesses, one access
@@ -971,6 +988,37 @@ vect_model_store_cost (stmt_vec_info stm
 				       vec_to_scalar, stmt_info, 0, vect_body);
     }
 
+  /* When vectorizing an SLP store into the function result assign
+     a penalty if the function returns in a non-vector register.
+     In this case we assume we'll end up with having to spill the
+     vector result and do element loads.  */
+  if (slp_node)
+    {
+      tree base = get_base_address (dr->ref);
+      if (base
+	  && (TREE_CODE (base) == RESULT_DECL
+	      || (DECL_P (base) && cfun_returns (base)))
+	  && !aggregate_value_p (base, cfun->decl))
+	{
+	  rtx reg = hard_function_value (TREE_TYPE (base), cfun->decl, 0, 1);
+	  if ((REG_P (reg) && ! VECTOR_MODE_P (GET_MODE (reg)))
+	      || GET_CODE (reg) == PARALLEL
+	      || GET_CODE (reg) == CONCAT)
+	    {
+	      /* Spill.  */
+	      inside_cost += record_stmt_cost (body_cost_vec,
+					       ncopies, vector_store,
+					       stmt_info, 0, vect_body);
+	      /* Element loads.  */
+	      unsigned int assumed_nunits = vect_nunits_for_cost (vectype);
+	      inside_cost += record_stmt_cost (body_cost_vec,
+					       ncopies * assumed_nunits,
+					       scalar_load,
+					       stmt_info, 0, vect_body);
+	    }
+	}
+    }
+
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
                      "vect_model_store_cost: inside_cost = %d, "
Index: gcc/testsuite/gcc.target/i386/pr84101.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr84101.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr84101.c	(working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-slp2-details" } */
+
+typedef struct uint64_pair uint64_pair_t ;
+struct uint64_pair
+{
+  unsigned long w0 ;
+  unsigned long w1 ;
+} ;
+
+uint64_pair_t pair(int num)
+{
+  uint64_pair_t p ;
+
+  p.w0 = num << 1 ;
+  p.w1 = num >> 1 ;
+
+  return p ;
+}
+
+/* { dg-final { scan-tree-dump-not "basic block vectorized" "slp2" } } */

  reply	other threads:[~2018-01-31 13:20 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-30 10:48 Richard Biener
2018-01-30 14:51 ` Richard Biener
2018-01-31 14:14   ` Richard Biener [this message]
2018-02-14  5:25 ` Jeff Law
2018-02-14 11:52   ` Richard Biener
2018-02-14 12:49     ` Jakub Jelinek
2018-02-16 23:49       ` Jeff Law
2018-06-28  5:52     ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.20.1801311419500.18265@zhemvz.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=law@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).