public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] rs6000: Handle "d" output in the bd*z patterns (PR70098)
@ 2016-03-12 13:55 Segher Boessenkool
  2016-03-12 14:59 ` David Edelsohn
  0 siblings, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2016-03-12 13:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: dje.gcc, Segher Boessenkool

In the rs6000 port, FLOAT_REGS can contain DImode values when compiling
for 64-bit targets.  Some instructions (like "fcfid" in the testcase,
convert from integer to DP float) only work on floating point registers.
So, we do want to allow DImode in these regs.

Now, in unusual cases IRA will assign FLOAT_REGS to some allocno where
some insns cannot handle FLOAT_REGS there, so they will need a reload.
Maybe IRA can be made smarter, but it isn't doing anything wrong here,
so we should be able to handle it.

The place it goes wrong is in the output of the *ctrdi_internal[1256]
pattern: the "bdz" and "bdnz" instructions.  GCC refuses to do output
reloads on JUMP_INSNs, probably because it is hard to do, needs different
strategies than "normal" reloads do, and it cannot even be done at all
for general patterns.  So JUMP_INSNs need to be able to handle every
possible output for the register class used.

These patterns already handle writing to "c" (the base insn case), and
to "r", "m", and "c" or "l"; all those via splitters.  We just need to
handle "d" as well.  That is what this patch does.  [A predicate in one
of the splitters needs to be touched up so that the correct splitter
is used for the FLOAT_REGS case.]

But, that leaves another problem.  One of the insns that are split to
is a move from a GPR to an FPR.  That work fine on targets with direct
move (which does exactly that), i.e. power8 and up.  But older targets
need memory to do the move, and this splitter runs after reload so
it cannot allocate memory; and allocating memory beforehand for every
bdnz insn is pretty horrible as well.

This patch implements the easy part.  With it, power8 works, where it
didn't before.

Tested on powerpc64-linux, -m32 and -m64, -mlra and -mno-lra.  Also
regstrapping on a power8 powerpc64le-linux.  Is this okay for trunk
if that works as expected?


Segher


2016-03-12  Segher Boessenkool  <segher@kernel.crashing.org>

	PR target/70098
	* config/rs6000/rs6000.md (*ctr<mode>_internal1, *ctr<mode>_internal2,
	*ctr<mode>_internal5, *ctr<mode>_internal6): Also allow "d" as output.
	(define_split for the GPR case): Use int_reg_operand instead of
	gpc_reg_operand for the output.

gcc/testsuite/
	PR target/70098
	* g++.dg/pr70098.C: New testcase.

---
 gcc/config/rs6000/rs6000.md    | 10 ++---
 gcc/testsuite/g++.dg/pr70098.C | 89 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 94 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr70098.C

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c92c868..f5337dd 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11909,7 +11909,7 @@ (define_insn "*ctr<mode>_internal1"
 			  (const_int 1))
 		      (label_ref (match_operand 0 "" ""))
 		      (pc)))
-   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*c*l")
+   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
@@ -11933,7 +11933,7 @@ (define_insn "*ctr<mode>_internal2"
 			  (const_int 1))
 		      (pc)
 		      (label_ref (match_operand 0 "" ""))))
-   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*c*l")
+   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
@@ -11959,7 +11959,7 @@ (define_insn "*ctr<mode>_internal5"
 			  (const_int 1))
 		      (label_ref (match_operand 0 "" ""))
 		      (pc)))
-   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*c*l")
+   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
@@ -11983,7 +11983,7 @@ (define_insn "*ctr<mode>_internal6"
 			  (const_int 1))
 		      (pc)
 		      (label_ref (match_operand 0 "" ""))))
-   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*c*l")
+   (set (match_operand:P 2 "nonimmediate_operand" "=1,*r,m,*d*c*l")
 	(plus:P (match_dup 1)
 		 (const_int -1)))
    (clobber (match_scratch:CC 3 "=X,&x,&x,&x"))
@@ -12010,7 +12010,7 @@ (define_split
                                       (const_int 1)])
                      (match_operand 5 "" "")
                      (match_operand 6 "" "")))
-   (set (match_operand:P 0 "gpc_reg_operand" "")
+   (set (match_operand:P 0 "int_reg_operand" "")
        (plus:P (match_dup 1) (const_int -1)))
    (clobber (match_scratch:CC 3 ""))
    (clobber (match_scratch:P 4 ""))]
diff --git a/gcc/testsuite/g++.dg/pr70098.C b/gcc/testsuite/g++.dg/pr70098.C
new file mode 100644
index 0000000..c7b4f63
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr70098.C
@@ -0,0 +1,89 @@
+// PR target/70098
+// { dg-do compile }
+// { dg-options -O2 }
+// { dg-require-effective-target c++11 }
+
+template < typename > struct traits;
+template < typename, int _Rows, int _Cols, int = 0, int = _Rows,
+	int = _Cols > class Matrix;
+template < typename > class G;
+template < typename Derived > struct A {
+	typedef G < Derived > type;
+};
+
+template < typename Derived > class C {
+public:
+	enum { RowsAtCompileTime =
+		    traits < Derived >::RowsAtCompileTime } static Zero;
+};
+
+template < typename Derived > class G:public C < Derived > {
+};
+
+template < int _Rows > class D {
+public:
+	long rows() {
+		return _Rows;
+	}
+};
+
+template < typename Derived > class PlainObjectBase:public A < Derived >::type {
+	typedef typename A < Derived >::type Base;
+	D < Base::RowsAtCompileTime > m_storage;
+
+public:
+	long rows() {
+		return m_storage.rows();
+	}
+};
+
+int fn1();
+
+struct B {
+	static long run(long x, long) {
+		int offset(fn1());
+		 return x + offset;
+}};
+
+long fn2(int x)
+{
+	return B::run(x, 0);
+}
+
+template < typename _Scalar, int _Rows, int _Cols, int _Options, int _MaxRows,
+    int _MaxCols >
+    struct traits <Matrix < _Scalar, _Rows, _Cols, _Options, _MaxRows, _MaxCols >> {
+	enum { RowsAtCompileTime = _Rows };
+};
+
+template < typename, int, int, int, int _MaxRows, int _MaxCols >
+	class Matrix:public PlainObjectBase < Matrix < double, _MaxRows,
+	_MaxCols >> {
+public:
+	template < typename OtherDerived > Matrix(OtherDerived);
+};
+
+struct F {
+	static Matrix < double, 2, 2 > run(long size) {
+		Matrix < double, 2, 2 > diag = Matrix < double, 2, 2 >::Zero;
+		long i = 0;
+		while (i < size) {
+			long randomInt = fn2(-1);
+			if (randomInt == 0)
+				++i;
+			else {
+				double alpha(randomInt);
+				 diag = alpha;
+				 i = 2;
+			}
+		}
+
+		return diag;
+	}
+};
+
+void fn3(Matrix < double, 2, 2 > m)
+{
+	long size = m.rows();
+	F::run(size);
+}
-- 
1.9.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] rs6000: Handle "d" output in the bd*z patterns (PR70098)
  2016-03-12 13:55 [PATCH] rs6000: Handle "d" output in the bd*z patterns (PR70098) Segher Boessenkool
@ 2016-03-12 14:59 ` David Edelsohn
  2016-03-13 18:52   ` Segher Boessenkool
  0 siblings, 1 reply; 5+ messages in thread
From: David Edelsohn @ 2016-03-12 14:59 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches

On Sat, Mar 12, 2016 at 8:55 AM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> In the rs6000 port, FLOAT_REGS can contain DImode values when compiling
> for 64-bit targets.  Some instructions (like "fcfid" in the testcase,
> convert from integer to DP float) only work on floating point registers.
> So, we do want to allow DImode in these regs.
>
> Now, in unusual cases IRA will assign FLOAT_REGS to some allocno where
> some insns cannot handle FLOAT_REGS there, so they will need a reload.
> Maybe IRA can be made smarter, but it isn't doing anything wrong here,
> so we should be able to handle it.
>
> The place it goes wrong is in the output of the *ctrdi_internal[1256]
> pattern: the "bdz" and "bdnz" instructions.  GCC refuses to do output
> reloads on JUMP_INSNs, probably because it is hard to do, needs different
> strategies than "normal" reloads do, and it cannot even be done at all
> for general patterns.  So JUMP_INSNs need to be able to handle every
> possible output for the register class used.
>
> These patterns already handle writing to "c" (the base insn case), and
> to "r", "m", and "c" or "l"; all those via splitters.  We just need to
> handle "d" as well.  That is what this patch does.  [A predicate in one
> of the splitters needs to be touched up so that the correct splitter
> is used for the FLOAT_REGS case.]
>
> But, that leaves another problem.  One of the insns that are split to
> is a move from a GPR to an FPR.  That work fine on targets with direct
> move (which does exactly that), i.e. power8 and up.  But older targets
> need memory to do the move, and this splitter runs after reload so
> it cannot allocate memory; and allocating memory beforehand for every
> bdnz insn is pretty horrible as well.
>
> This patch implements the easy part.  With it, power8 works, where it
> didn't before.
>
> Tested on powerpc64-linux, -m32 and -m64, -mlra and -mno-lra.  Also
> regstrapping on a power8 powerpc64le-linux.  Is this okay for trunk
> if that works as expected?
>
>
> Segher
>
>
> 2016-03-12  Segher Boessenkool  <segher@kernel.crashing.org>
>
>         PR target/70098
>         * config/rs6000/rs6000.md (*ctr<mode>_internal1, *ctr<mode>_internal2,
>         *ctr<mode>_internal5, *ctr<mode>_internal6): Also allow "d" as output.
>         (define_split for the GPR case): Use int_reg_operand instead of
>         gpc_reg_operand for the output.
>
> gcc/testsuite/
>         PR target/70098
>         * g++.dg/pr70098.C: New testcase.

This is okay.

The testcase will need some XFAILs.

Thanks, David

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] rs6000: Handle "d" output in the bd*z patterns (PR70098)
  2016-03-12 14:59 ` David Edelsohn
@ 2016-03-13 18:52   ` Segher Boessenkool
  2016-03-13 21:55     ` David Edelsohn
  0 siblings, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2016-03-13 18:52 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GCC Patches

On Sat, Mar 12, 2016 at 09:59:12AM -0500, David Edelsohn wrote:
> > 2016-03-12  Segher Boessenkool  <segher@kernel.crashing.org>
> >
> >         PR target/70098
> >         * config/rs6000/rs6000.md (*ctr<mode>_internal1, *ctr<mode>_internal2,
> >         *ctr<mode>_internal5, *ctr<mode>_internal6): Also allow "d" as output.
> >         (define_split for the GPR case): Use int_reg_operand instead of
> >         gpc_reg_operand for the output.
> >
> > gcc/testsuite/
> >         PR target/70098
> >         * g++.dg/pr70098.C: New testcase.
> 
> This is okay.
> 
> The testcase will need some XFAILs.

That wasn't so easy.  I came up with the following; okay as well?
(I'll fold it before committing).


Segher


gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_powerpc64_no_dm):
	New function.


diff --git a/gcc/testsuite/g++.dg/pr70098.C b/gcc/testsuite/g++.dg/pr70098.C
index c7b4f63..f5eb48f 100644
--- a/gcc/testsuite/g++.dg/pr70098.C
+++ b/gcc/testsuite/g++.dg/pr70098.C
@@ -2,6 +2,8 @@
 // { dg-do compile }
 // { dg-options -O2 }
 // { dg-require-effective-target c++11 }
+// { dg-xfail-if "PR70098" { lp64 && powerpc64_no_dm } }
+// { dg-prune-output ".*internal compiler error.*" }
 
 template < typename > struct traits;
 template < typename, int _Rows, int _Cols, int = 0, int = _Rows,
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 5af139b..49b82c3 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -1629,6 +1629,19 @@ proc check_effective_target_avx_runtime { } {
     return 0
 }
 
+# Return 1 if we are compiling for 64-bit PowerPC but we do not use direct
+# move instructions for moves from GPR to FPR.
+
+proc check_effective_target_powerpc64_no_dm { } {
+    # The "mulld" checks if we are generating PowerPC64 code.  The "lfd"
+    # checks if we do not use direct moves, but use the old-fashioned
+    # slower move-via-the-stack.
+    return [check_no_messages_and_pattern powerpc64_no_dm \
+	{\mmulld\M.*\mlfd} assembly {
+	    double f(long long x) { return x*x; }
+	} {-O2}]
+}
+
 # Return 1 if the target supports executing power8 vector instructions, 0
 # otherwise.  Cache the result.
 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] rs6000: Handle "d" output in the bd*z patterns (PR70098)
  2016-03-13 18:52   ` Segher Boessenkool
@ 2016-03-13 21:55     ` David Edelsohn
  2016-03-13 22:16       ` Segher Boessenkool
  0 siblings, 1 reply; 5+ messages in thread
From: David Edelsohn @ 2016-03-13 21:55 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches

On Sun, Mar 13, 2016 at 2:52 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Sat, Mar 12, 2016 at 09:59:12AM -0500, David Edelsohn wrote:
>> > 2016-03-12  Segher Boessenkool  <segher@kernel.crashing.org>
>> >
>> >         PR target/70098
>> >         * config/rs6000/rs6000.md (*ctr<mode>_internal1, *ctr<mode>_internal2,
>> >         *ctr<mode>_internal5, *ctr<mode>_internal6): Also allow "d" as output.
>> >         (define_split for the GPR case): Use int_reg_operand instead of
>> >         gpc_reg_operand for the output.
>> >
>> > gcc/testsuite/
>> >         PR target/70098
>> >         * g++.dg/pr70098.C: New testcase.
>>
>> This is okay.
>>
>> The testcase will need some XFAILs.
>
> That wasn't so easy.  I came up with the following; okay as well?
> (I'll fold it before committing).
>
>
> Segher
>
>
> gcc/testsuite/
>         * lib/target-supports.exp (check_effective_target_powerpc64_no_dm):
>         New function.

This is okay with me.

Should the testcase go in g++.dg or gcc.target/powerpc?

Thanks, David

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] rs6000: Handle "d" output in the bd*z patterns (PR70098)
  2016-03-13 21:55     ` David Edelsohn
@ 2016-03-13 22:16       ` Segher Boessenkool
  0 siblings, 0 replies; 5+ messages in thread
From: Segher Boessenkool @ 2016-03-13 22:16 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GCC Patches

On Sun, Mar 13, 2016 at 05:54:58PM -0400, David Edelsohn wrote:
> Should the testcase go in g++.dg or gcc.target/powerpc?

I think it is a good thing to test everywhere, not just on Power.
Once we get it to work properly there will be no target-specific
anything left in the testcase, either.


Segher

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-03-13 22:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-12 13:55 [PATCH] rs6000: Handle "d" output in the bd*z patterns (PR70098) Segher Boessenkool
2016-03-12 14:59 ` David Edelsohn
2016-03-13 18:52   ` Segher Boessenkool
2016-03-13 21:55     ` David Edelsohn
2016-03-13 22:16       ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).