public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Fix PR 67609
@ 2015-10-27 20:22 Richard Henderson
  2015-10-27 23:47 ` H.J. Lu
  2015-11-25 17:21 ` James Greenhalgh
  0 siblings, 2 replies; 14+ messages in thread
From: Richard Henderson @ 2015-10-27 20:22 UTC (permalink / raw)
  To: GCC Patches; +Cc: Richard Biener, Marcus Shawcroft, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 1411 bytes --]

The problem in this PR is that a word-mode subreg is used to write to a
multi-word pseudo, under the assumption that is the correct way to insert a
value into the appropriate bits of the pseudo.

Except that the pseudo then gets assigned to an SSE register, at which point
all of the assumptions start to fall apart.  Primarily, (subreg X 0) and
(subreg X 8) do not in the end resolve to different hard registers, so an
assignment to (subreg X 0) may legitimately clobber all of X.

There *are* ways to insert a value into an element of an SSE register, but what
comes out of reload is indistinguishable from a normal DImode assignment.

An ideal solution would be to use a different method than subregs for
multi-word registers.  Using the same code for "view convert", insert, and
extract is going to continue to cause us problems.

I had a look over the other major vector targets:

  * ppc, sparc, s390 already disallow the described condition.

  * arm ought not be subject to this problem because each vector register is
composed of 4 individually addressable registers.  So when (subreg X N) is
simplified, we really do resolve to the desired hard register.

  * aarch64 is almost certainly vulnerable, since it deleted its
CANNOT_CHANGE_MODE_CLASS implementation in January.  I haven't tried to create
a test case that fails for it, but I'm certain it's possible.


Tested on x86_64 and committed.


r~

[-- Attachment #2: zz --]
[-- Type: text/plain, Size: 5783 bytes --]

	PR rtl-opt/67609
	* config/i386/i386.c (ix86_cannot_change_mode_class): Disallow
	narrowing subregs on SSE and MMX registers.
	* doc/tm.texi.in (CANNOT_CHANGE_MODE_CLASS): Clarify when subregs that
	appear to be sub-words of multi-register pseudos must be rejected.
	* doc/tm.texi: Regenerate.
testsuite/
	* gcc.target/i386/pr67609-2.c: New test.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d05c8f8..82fd054 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -43031,15 +43031,22 @@ ix86_cannot_change_mode_class (machine_mode from, machine_mode to,
   if (MAYBE_FLOAT_CLASS_P (regclass))
     return true;
 
+  /* Vector registers do not support QI or HImode loads.  If we don't
+     disallow a change to these modes, reload will assume it's ok to
+     drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
+     the vec_dupv4hi pattern.
+
+     Further, we cannot allow word_mode subregs of full vector modes.
+     Otherwise the middle-end will assume it's ok to store to
+     (subreg:DI (reg:TI 100) 0) in order to modify only the low 64 bits
+     of the 128-bit register.  However, after reload the subreg will
+     be dropped leaving a plain DImode store.  This is indistinguishable
+     from a "normal" DImode move, and so we're justified to use movsd,
+     which modifies the entire 128-bit register.
+
+     Combining these two conditions, disallow all narrowing mode changes.  */
   if (MAYBE_SSE_CLASS_P (regclass) || MAYBE_MMX_CLASS_P (regclass))
-    {
-      /* Vector registers do not support QI or HImode loads.  If we don't
-	 disallow a change to these modes, reload will assume it's ok to
-	 drop the subreg from (subreg:SI (reg:HI 100) 0).  This affects
-	 the vec_dupv4hi pattern.  */
-      if (GET_MODE_SIZE (from) < 4)
-	return true;
-    }
+    return GET_MODE_SIZE (to) < GET_MODE_SIZE (from);
 
   return false;
 }
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index a8666b1..606ddb6 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -2823,8 +2823,8 @@ in the reload pass.
 If defined, a C expression that returns nonzero for a @var{class} for which
 a change from mode @var{from} to mode @var{to} is invalid.
 
-For the example, loading 32-bit integer or floating-point objects into
-floating-point registers on the Alpha extends them to 64 bits.
+For example, loading 32-bit integer or floating-point objects into
+floating-point registers on Alpha extends them to 64 bits.
 Therefore loading a 64-bit object and then storing it as a 32-bit object
 does not store the low-order 32 bits, as would be the case for a normal
 register.  Therefore, @file{alpha.h} defines @code{CANNOT_CHANGE_MODE_CLASS}
@@ -2835,6 +2835,17 @@ as below:
   (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
    ? reg_classes_intersect_p (FLOAT_REGS, (CLASS)) : 0)
 @end smallexample
+
+Even if storing from a register in mode @var{to} would be valid,
+if both @var{from} and @code{raw_reg_mode} for @var{class} are wider
+than @code{word_mode}, then we must prevent @var{to} narrowing the
+mode.  This happens when the middle-end assumes that it can load
+or store pieces of an @var{N}-word pseudo, and that the pseudo will
+eventually be allocated to @var{N} @code{word_mode} hard registers.
+Failure to prevent this kind of mode change will result in the
+entire @code{raw_reg_mode} being modified instead of the partial
+value that the middle-end intended.
+
 @end defmac
 
 @deftypefn {Target Hook} reg_class_t TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS (int, @var{reg_class_t})
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 69b6cf9..93620eb 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2461,8 +2461,8 @@ in the reload pass.
 If defined, a C expression that returns nonzero for a @var{class} for which
 a change from mode @var{from} to mode @var{to} is invalid.
 
-For the example, loading 32-bit integer or floating-point objects into
-floating-point registers on the Alpha extends them to 64 bits.
+For example, loading 32-bit integer or floating-point objects into
+floating-point registers on Alpha extends them to 64 bits.
 Therefore loading a 64-bit object and then storing it as a 32-bit object
 does not store the low-order 32 bits, as would be the case for a normal
 register.  Therefore, @file{alpha.h} defines @code{CANNOT_CHANGE_MODE_CLASS}
@@ -2473,6 +2473,17 @@ as below:
   (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
    ? reg_classes_intersect_p (FLOAT_REGS, (CLASS)) : 0)
 @end smallexample
+
+Even if storing from a register in mode @var{to} would be valid,
+if both @var{from} and @code{raw_reg_mode} for @var{class} are wider
+than @code{word_mode}, then we must prevent @var{to} narrowing the
+mode.  This happens when the middle-end assumes that it can load
+or store pieces of an @var{N}-word pseudo, and that the pseudo will
+eventually be allocated to @var{N} @code{word_mode} hard registers.
+Failure to prevent this kind of mode change will result in the
+entire @code{raw_reg_mode} being modified instead of the partial
+value that the middle-end intended.
+
 @end defmac
 
 @hook TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS
diff --git a/gcc/testsuite/gcc.target/i386/pr67609-2.c b/gcc/testsuite/gcc.target/i386/pr67609-2.c
new file mode 100644
index 0000000..fece437
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr67609-2.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -msse2" } */
+/* { dg-require-effective-target sse2 } */
+
+#include <stdlib.h>
+#include <emmintrin.h>
+
+__m128d reg = { 2.0, 4.0 };
+
+void
+__attribute__((noinline))
+set_lower (double b)
+{
+  double v[2];
+  _mm_store_pd(v, reg);
+  v[0] = b;
+  reg = _mm_load_pd(v);
+}
+
+int
+main ()
+{
+  set_lower (6.0);
+
+  if (reg[1] != 4.0)
+    abort ();
+
+  return 0;
+}

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-01-12 10:32 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-27 20:22 Fix PR 67609 Richard Henderson
2015-10-27 23:47 ` H.J. Lu
2015-10-27 23:49   ` H.J. Lu
2015-11-25 17:21 ` James Greenhalgh
2015-11-26 11:02   ` Richard Henderson
2015-11-27 13:09     ` [Patch AArch64] Reinstate CANNOT_CHANGE_MODE_CLASS to fix pr67609 James Greenhalgh
2015-11-27 14:02       ` Richard Biener
2015-12-09 11:45       ` James Greenhalgh
2015-12-09 13:13       ` Marcus Shawcroft
2015-12-14 11:01         ` James Greenhalgh
2015-12-14 11:49           ` Marcus Shawcroft
2015-12-18 12:13             ` Backport: " James Greenhalgh
2016-01-11 12:12               ` James Greenhalgh
2016-01-12 10:32               ` Marcus Shawcroft

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).