public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
@ 2015-02-22 18:49 David Edelsohn
  2015-02-23 16:08 ` Richard Biener
  2015-02-24  7:22 ` Martin Sebor
  0 siblings, 2 replies; 19+ messages in thread
From: David Edelsohn @ 2015-02-22 18:49 UTC (permalink / raw)
  To: Martin Sebor; +Cc: GCC Patches

Does this patch really fix the problem?  The PR notes that the
testcase fails and code quality has regressed.  Has the code
generation been corrected but the testcase looks for the wrong string?
 Presumably the message that basic block was vectorized means that the
code generation is correct, but the commentary about the patch does
not mention it.

Thanks, David

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-02-22 18:49 [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1 David Edelsohn
@ 2015-02-23 16:08 ` Richard Biener
  2015-02-24  7:22 ` Martin Sebor
  1 sibling, 0 replies; 19+ messages in thread
From: Richard Biener @ 2015-02-23 16:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Martin Sebor, GCC Patches

On Sun, Feb 22, 2015 at 7:45 PM, David Edelsohn <dje.gcc@gmail.com> wrote:
> Does this patch really fix the problem?  The PR notes that the
> testcase fails and code quality has regressed.  Has the code
> generation been corrected but the testcase looks for the wrong string?
>  Presumably the message that basic block was vectorized means that the
> code generation is correct, but the commentary about the patch does
> not mention it.

We certainly nowhere dump "basic block vectorized using SLP", only
"basic block vectorized" and "Basic block will be vectorized using SLP".
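To illustrate the mismatch, here is a minimal sketch in C. It is not GCC or DejaGnu code: the helper `count_matches` and the array `dump_lines` are hypothetical names, and a plain substring search stands in for the Tcl regexp that scan-tree-dump-times really applies to the dump file. For literal strings like these the outcome should be the same.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the lines in LINES that contain PATTERN
   as a plain, case-sensitive substring.  This is a simplification of
   what scan-tree-dump-times does with a regexp over the dump file.  */
static int
count_matches (const char *const *lines, size_t nlines, const char *pattern)
{
  int count = 0;
  for (size_t i = 0; i < nlines; ++i)
    if (strstr (lines[i], pattern) != NULL)
      ++count;
  return count;
}

/* The two messages quoted above, standing in for an slp2 dump.  */
static const char *const dump_lines[] = {
  "basic block vectorized",
  "Basic block will be vectorized using SLP",
};
```

Searching these two lines for the old pattern "basic block vectorized using SLP" finds nothing, because neither message contains that exact text, while the shorter pattern "basic block vectorized" matches the first line only.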

Richard.

> Thanks, David


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-02-22 18:49 [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1 David Edelsohn
  2015-02-23 16:08 ` Richard Biener
@ 2015-02-24  7:22 ` Martin Sebor
  2015-02-24 20:12   ` Jeff Law
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Sebor @ 2015-02-24  7:22 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GCC Patches

On 02/22/2015 11:45 AM, David Edelsohn wrote:
> Does this patch really fix the problem?  The PR notes that the
> testcase fails and code quality has regressed.  Has the code
> generation been corrected but the testcase looks for the wrong string?
>   Presumably the message that basic block was vectorized means that the
> code generation is correct, but the commentary about the patch does
> not mention it.

There appear to be at least three problems at play here:

1) The test expects the wrong string to determine success.

2) GCC 4.9.0 and later emit suboptimal code compared to 4.8.4.

3) With (1) fixed, the test fails to detect (2).

During my initial investigation, besides trunk, I had only looked
at the assembly emitted at revision 198852 since there the test
is reported as passing in comment #2. The code appears comparable
between the two.

Now that I've also compared the assembly emitted by 4.8.4 I see
what I suspect the original reporter was referring to: 4.9.0 and
later both use vectorization to copy the arrays and also assign
the four elements using ordinary loads and stores.  And since
the code has been successfully vectorized (and GCC reports it
in the dump) the test passes.
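A minimal sketch of the kind of basic block under discussion, modeled on the existing test but with hypothetical names (`copy_four_words`, `copy_is_correct`), shows why neither a run-time check nor the dump message can tell the two code sequences apart: both the purely vectorized code and the code with the extra scalar loads and stores leave the same values in `out`.

```c
#define N 16

unsigned int out[N];
const unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};

/* Copy four consecutive words.  The source starts at in[1], so it is
   assumed not to sit on a 16-byte boundary (a misaligned load), while
   the destination starts at the beginning of the array.  */
__attribute__ ((noinline)) void
copy_four_words (void)
{
  const unsigned int *pin = &in[1];
  unsigned int *pout = &out[0];

  *pout++ = *pin++;
  *pout++ = *pin++;
  *pout++ = *pin++;
  *pout++ = *pin++;
}

/* Return 1 if the copy produced the expected values.  This says
   nothing about which instructions were used to perform it.  */
int
copy_is_correct (void)
{
  return out[0] == in[1]
	 && out[1] == in[2]
	 && out[2] == in[3]
	 && out[3] == in[4];
}
```

Distinguishing the good code from the regressed code therefore needs a check on the emitted instructions themselves, not only on the result or on the vectorizer's message.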

I'll need to spend some more time to find the revision that
caused this.

Martin

PS For reference, the assembly emitted by 4.8.4 for powerpc-linux
is as follows:

main1:
	lis 10,.LANCHOR0@ha
	la 10,.LANCHOR0@l(10)
	addi 9,10,4
	lvx 1,0,9
	neg 8,9
	lis 9,.LANCHOR1@ha
	lvsr 13,0,8
	addi 10,10,16
	la 9,.LANCHOR1@l(9)
	lvx 0,0,10
	li 3,0
	vperm 0,1,0,13
	stvx 0,0,9
	blr

...while 5.0.0 20150223 emits this:

main1:
	lis 7,.LANCHOR0@ha
	stwu 1,-32(1)
	la 7,.LANCHOR0@l(7)
	li 3,0
	addi 5,7,4
	addi 7,7,16
	rlwinm 6,5,0,0,27
	lvx 0,0,7
	lwz 9,4(6)
	addi 7,1,16
	lwz 11,12(6)
	neg 5,5
	lwz 10,8(6)
	lwz 8,0(6)
	lvsr 1,0,5
	stw 9,4(7)
	lis 9,.LANCHOR1@ha
	stw 8,0(7)
	la 9,.LANCHOR1@l(9)
	stw 10,8(7)
	stw 11,12(7)
	lvx 13,0,7
	vperm 0,13,0,1
	stvx 0,0,9
	addi 1,1,32
	blr

powerpc64-linux has a similar problem and emits:

main1:
         .quad   .L.main1,.TOC.@tocbase,0
         .previous
         .type   main1, @function
.L.main1:
         addis 10,2,.LANCHOR0@toc@ha
         li 3,0
         addi 10,10,.LANCHOR0@toc@l
         addi 9,10,4
         addi 8,10,16
         neg 7,9
         rldicr 9,9,0,59
         lvx 0,0,8
         addi 8,1,-16
         ld 11,8(9)
         ld 10,0(9)
         lvsr 1,0,7
         addis 9,2,.LANCHOR1@toc@ha
         addi 9,9,.LANCHOR1@toc@l
         std 10,0(8)
         std 11,8(8)
         ori 2,2,0
         lvx 13,0,8
         vperm 0,13,0,1
         stvx 0,0,9
         blr


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-02-24  7:22 ` Martin Sebor
@ 2015-02-24 20:12   ` Jeff Law
  2015-02-27 14:30     ` David Edelsohn
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff Law @ 2015-02-24 20:12 UTC (permalink / raw)
  To: Martin Sebor, David Edelsohn; +Cc: GCC Patches

On 02/23/15 20:38, Martin Sebor wrote:
> On 02/22/2015 11:45 AM, David Edelsohn wrote:
>> Does this patch really fix the problem?  The PR notes that the
>> testcase fails and code quality has regressed.  Has the code
>> generation been corrected but the testcase looks for the wrong string?
>>   Presumably the message that basic block was vectorized means that the
>> code generation is correct, but the commentary about the patch does
>> not mention it.
>
> There appear to be at least three problems at play here:
>
> 1) The test expects the wrong string to determine success.
>
> 2) GCC 4.9.0 and later emit suboptimal code compared to 4.8.4.
>
> 3) With (1) fixed, the test fails to detect (2).
>
> During my initial investigation, besides trunk, I had only looked
> at the assembly emitted at revision 198852 since there the test
> is reported as passing in comment #2. The code appears comparable
> between the two.
>
> Now that I've also compared the assembly emitted by 4.8.4 I see
> what I suspect the original reporter was referring to: 4.9.0 and
> later both use vectorization to copy the arrays and also assign
> the four elements using ordinary loads and stores.  And since
> the code has been successfully vectorized (and GCC reports it
> in the dump) the test passes.
>
> I'll need to spend some more time to find the revision that
> caused this.
Something to bear in mind, this may turn out to be something that isn't 
fixable at this stage in development.  So please stay in contact with 
your findings.

Regardless, we should find a way to change the testcase so that it can 
correctly identify the missed optimization.

Jeff


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-02-24 20:12   ` Jeff Law
@ 2015-02-27 14:30     ` David Edelsohn
  2015-02-27 15:59       ` Martin Sebor
  0 siblings, 1 reply; 19+ messages in thread
From: David Edelsohn @ 2015-02-27 14:30 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: Martin Sebor, GCC Patches

On Tue, Feb 24, 2015 at 2:30 PM, Jeff Law <law@redhat.com> wrote:
> On 02/23/15 20:38, Martin Sebor wrote:
>>
>> On 02/22/2015 11:45 AM, David Edelsohn wrote:
>>>
>>> Does this patch really fix the problem?  The PR notes that the
>>> testcase fails and code quality has regressed.  Has the code
>>> generation been corrected but the testcase looks for the wrong string?
>>>   Presumably the message that basic block was vectorized means that the
>>> code generation is correct, but the commentary about the patch does
>>> not mention it.
>>
>>
>> There appear to be at least three problems at play here:
>>
>> 1) The test expects the wrong string to determine success.
>>
>> 2) GCC 4.9.0 and later emit suboptimal code compared to 4.8.4.
>>
>> 3) With (1) fixed, the test fails to detect (2).
>>
>> During my initial investigation, besides trunk, I had only looked
>> at the assembly emitted at revision 198852 since there the test
>> is reported as passing in comment #2. The code appears comparable
>> between the two.
>>
>> Now that I've also compared the assembly emitted by 4.8.4 I see
>> what I suspect the original reporter was referring to: 4.9.0 and
>> later both use vectorization to copy the arrays and also assign
>> the four elements using ordinary loads and stores.  And since
>> the code has been successfully vectorized (and GCC reports it
>> in the dump) the test passes.
>>
>> I'll need to spend some more time to find the revision that
>> caused this.
>
> Something to bear in mind, this may turn out to be something that isn't
> fixable at this stage in development.  So please stay in contact with your
> findings.
>
> Regardless, we should find a way to change the testcase so that it can
> correctly identify the missed optimization.

The fix to the testcase is fine with me.

Given that Martin's fix to the testcase allowed it to succeed without
Richi's fix for the underlying problem, is there a modification to the
testcase or a new testcase that would really test the optimization?

Thanks, David


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-02-27 14:30     ` David Edelsohn
@ 2015-02-27 15:59       ` Martin Sebor
  2015-02-28  1:30         ` Martin Sebor
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Sebor @ 2015-02-27 15:59 UTC (permalink / raw)
  To: David Edelsohn, Jeff Law, Richard Biener; +Cc: GCC Patches

On 02/27/2015 07:27 AM, David Edelsohn wrote:
> On Tue, Feb 24, 2015 at 2:30 PM, Jeff Law <law@redhat.com> wrote:
>> On 02/23/15 20:38, Martin Sebor wrote:
>>>
>>> On 02/22/2015 11:45 AM, David Edelsohn wrote:
>>>>
>>>> Does this patch really fix the problem?  The PR notes that the
>>>> testcase fails and code quality has regressed.  Has the code
>>>> generation been corrected but the testcase looks for the wrong string?
>>>>    Presumably the message that basic block was vectorized means that the
>>>> code generation is correct, but the commentary about the patch does
>>>> not mention it.
>>>
>>>
>>> There appear to be at least three problems at play here:
>>>
>>> 1) The test expects the wrong string to determine success.
>>>
>>> 2) GCC 4.9.0 and later emit suboptimal code compared to 4.8.4.
>>>
>>> 3) With (1) fixed, the test fails to detect (2).
>>>
>>> During my initial investigation, besides trunk, I had only looked
>>> at the assembly emitted at revision 198852 since there the test
>>> is reported as passing in comment #2. The code appears comparable
>>> between the two.
>>>
>>> Now that I've also compared the assembly emitted by 4.8.4 I see
>>> what I suspect the original reporter was referring to: 4.9.0 and
>>> later both use vectorization to copy the arrays and also assign
>>> the four elements using ordinary loads and stores.  And since
>>> the code has been successfully vectorized (and GCC reports it
>>> in the dump) the test passes.
>>>
>>> I'll need to spend some more time to find the revision that
>>> caused this.
>>
>> Something to bear in mind, this may turn out to be something that isn't
>> fixable at this stage in development.  So please stay in contact with your
>> findings.
>>
>> Regardless, we should find a way to change the testcase so that it can
>> correctly identify the missed optimization.
>
> The fix to the testcase is fine with me.
>
> Given that Martin's fix to the testcase allowed it to succeed without
> Richi's fix for the underlying problem, is there a modification to the
> testcase or a new testcase that would really test the optimization?

Let me work on it.

Martin


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-02-27 15:59       ` Martin Sebor
@ 2015-02-28  1:30         ` Martin Sebor
  2015-03-02 13:58           ` Richard Biener
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Sebor @ 2015-02-28  1:30 UTC (permalink / raw)
  To: David Edelsohn, Jeff Law, Richard Biener; +Cc: GCC Patches

>> Given that Martin's fix to the testcase allowed it to succeed without
>> Richi's fix for the underlying problem, is there a modification to the
>> testcase or a new testcase that would really test the optimization?
>
> Let me work on it.

Below is a patch with a couple of minor tweaks to the existing
test: first, to update the search string, and second, to exercise
the vectorization not only when the source address isn't aligned
on the expected boundary but also when the destination address
isn't.  This enhancement revealed an outstanding aspect of the
regression (not fixed by Richard's already committed patch).

Besides this change, the patch also adds a number of other
tests to better exercise the vectorization by verifying it
takes place for arrays of elements of other sizes besides
word: byte, half word, and double word.  Those tests reveal
both another regression WRT 4.8 and further vectorization
opportunities not exploited even in 4.8.  I marked the latter
XFAIL in the tests so that when the regression is fully
resolved, the tests should pass with no unexpected failures.
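The idea behind the additional tests can be sketched as follows. This is an illustrative adaptation, not the patch itself: it uses fixed-width types from `<stdint.h>` (an assumption made here so the element size does not depend on the target ABI) and a hypothetical macro name `DEFINE_COPY`. Each generated function copies 16 bytes, so the number of elements copied is 16 divided by the element size, and the source and destination offsets control which side of the copy is misaligned.

```c
#include <stdint.h>

/* Define a 33-element constant source array and an equally sized
   destination array for element type T.  */
#define DEFINE_DATA(T)							\
  static const T T##_src[] = {						\
     0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,	\
    16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,	\
    32 };								\
  static T T##_dst[sizeof T##_src / sizeof (T)]

/* Define copy_T_SRCOFF_DSTOFF, which copies 16 bytes' worth of
   elements from T_src + SRCOFF to T_dst + DSTOFF.  */
#define DEFINE_COPY(T, srcoff, dstoff)					\
  static void copy_##T##_##srcoff##_##dstoff (void)			\
  {									\
    const T *s = T##_src + srcoff;					\
    T *d = T##_dst + dstoff;						\
    for (unsigned i = 0; i != 16 / sizeof *s; ++i)			\
      *d++ = *s++;							\
  }

DEFINE_DATA (uint8_t);
DEFINE_DATA (uint32_t);

/* Sixteen byte elements, source offset by one element.  */
DEFINE_COPY (uint8_t, 1, 0)

/* Four word elements, destination offset by three elements.  */
DEFINE_COPY (uint32_t, 0, 3)
```

Calling `copy_uint8_t_1_0 ()` fills `uint8_t_dst[0..15]` from `uint8_t_src[1..16]`, and `copy_uint32_t_0_3 ()` fills `uint32_t_dst[3..6]` from `uint32_t_src[0..3]`. Whether the compiler turns each loop into vector instructions alone is what the assembler scans in the tests are meant to verify.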

Martin

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index cc86e37..4edd559 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,15 @@
+2015-02-27  Martin Sebor  <msebor@redhat.com>
+
+	PR testsuite/63175
+	* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c (main1): Rename...
+	(copy_to_unaligned): ...to this and move checking of results into
+	main.
+	(copy_from_unaligned): New function.
+	* costmodel-bb-slp-pr63175-base.c: New test.
+	* costmodel-bb-slp-pr63175-dword.c: New test.
+	* costmodel-bb-slp-pr63175-hword.c: New test.
+	* costmodel-bb-slp-pr63175-word.c: New test.
+
  2015-02-27  Jakub Jelinek  <jakub@redhat.com>

  	PR tree-optimization/65048
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
index e1bc1a8..a2dc367 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
@@ -1,46 +1,63 @@
  /* { dg-require-effective-target vect_int } */

-#include <stdarg.h>
  #include "../../tree-vect.h"

  #define N 16

  unsigned int out[N];
-unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+const unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};

-__attribute__ ((noinline)) int
-main1 (unsigned int x, unsigned int y)
+__attribute__ ((noinline)) void
+copy_to_unaligned (void)
  {
-  int i;
-  unsigned int *pin = &in[1];
+  const unsigned int *pin = &in[1];
    unsigned int *pout = &out[0];
-  unsigned int a0, a1, a2, a3;

    /* Misaligned load.  */
    *pout++ = *pin++;
    *pout++ = *pin++;
    *pout++ = *pin++;
    *pout++ = *pin++;
+}

-  /* Check results.  */
-  if (out[0] != in[1]
-      || out[1] != in[2]
-      || out[2] != in[3]
-      || out[3] != in[4])
-    abort();
+__attribute__ ((noinline)) void
+copy_from_unaligned (void)
+{
+  const unsigned int *pin = &in[0];
+  unsigned int *pout = &out[1];

-  return 0;
+  /* Misaligned store.  */
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
  }

  int main (void)
  {
    check_vect ();

-  main1 (2, 3);
+  copy_to_unaligned ();
+
+  /* Check results outside of copy_to_unaligned, where the check
+     would likely be optimized away.  */
+  if (out[0] != in[1]
+      || out[1] != in[2]
+      || out[2] != in[3]
+      || out[3] != in[4])
+    abort();
+
+  copy_from_unaligned ();
+
+  if (out[1] != in[0]
+      || out[2] != in[1]
+      || out[3] != in[2]
+      || out[4] != in[3])
+    abort();

    return 0;
  }

-/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp2"  { xfail  vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2"  { xfail  vect_no_align } } } */
  /* { dg-final { cleanup-tree-dump "slp2" } } */

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c
new file mode 100644
index 0000000..b94cf0e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c
@@ -0,0 +1,45 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define DEFINE_DATA(T) \
+    const T T ## _ ## src[] = { \
+         0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,\
+        16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 }; \
+    T T ## _ ## dst[sizeof T ## _ ## src / sizeof (T)]
+
+#define DEFINE_COPY_M_N(T, srcoff, dstoff) \
+void copy_ ## T ## _ ## srcoff ## _ ## dstoff (void) { \
+    const T *s = T ## _ ## src + srcoff; \
+    T *d = T ## _ ## dst + dstoff; \
+    unsigned i; \
+    for (i = 0; i != 16 / sizeof *s; ++i) \
+        *d++ = *s++; \
+}
+
+#define DEFINE_COPY_M(T, M) \
+    DEFINE_COPY_M_N (T, M, 0); \
+    DEFINE_COPY_M_N (T, M, 1); \
+    DEFINE_COPY_M_N (T, M, 2); \
+    DEFINE_COPY_M_N (T, M, 3)
+
+#define TEST_COPY(T) \
+    DEFINE_DATA (T); \
+    DEFINE_COPY_M (T, 0); \
+    DEFINE_COPY_M (T, 1); \
+    DEFINE_COPY_M (T, 2); \
+    DEFINE_COPY_M (T, 3)
+
+#ifndef Type
+#  define Type char
+#endif
+
+TEST_COPY (Type);
+
+/* Verify that the assembly contains vector instructions alone
+   with no byte loads (lb, lbu, lbz, lbzu, or their indexed forms)
+   or byte stores (stb, stbu, stbx, stbux, or their indexed
+   forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tlbz?u?x? " { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "\tstbu?x? " { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-dword.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-dword.c
new file mode 100644
index 0000000..87d5cb6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-dword.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define Type long
+#include "costmodel-bb-slp-pr63175-base.c"
+
+/* Verify that the assembly contains vector instructions alone
+   with no doubleword loads (ld, ldu, or their indexed forms)
+   or stores (std, stdu, or their indexed forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tldu?x? " } } */
+/* { dg-final { scan-assembler-not "\tstdu?x? " } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-hword.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-hword.c
new file mode 100644
index 0000000..8c22294
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-hword.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define Type short
+#include "costmodel-bb-slp-pr63175-base.c"
+
+/* Verify that the assembly contains vector instructions alone
+   with no halfword loads (lh, lhz, lhzu or their indexed forms)
+   or halfword stores (sth, sthu, or their indexed forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tlhz?u?x? " { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "\tsthu?x? " { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-word.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-word.c
new file mode 100644
index 0000000..942b8a6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-word.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define Type int
+#include "costmodel-bb-slp-pr63175-base.c"
+
+/* Verify that the assembly contains vector instructions alone
+   with no word loads (lw, lwz, lwzu or their indexed forms)
+   or word stores (stw or stwu, or their indexed forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tlwz?u?x? " } } */
+/* { dg-final { scan-assembler-not "\tstwu?x? " } } */


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-02-28  1:30         ` Martin Sebor
@ 2015-03-02 13:58           ` Richard Biener
  2015-03-02 16:28             ` Martin Sebor
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Biener @ 2015-03-02 13:58 UTC (permalink / raw)
  To: Martin Sebor; +Cc: David Edelsohn, Jeff Law, GCC Patches

On Fri, 27 Feb 2015, Martin Sebor wrote:

> > > Given that Martin's fix to the testcase allowed it to succeed without
> > > Richi's fix for the underlying problem, is there a modification to the
> > > testcase or a new testcase that would really test the optimization?
> > 
> > Let me work on it.
> 
> Below is a patch with a couple of minor tweaks to the existing
> test first to update the search string and second to better
> exercise the vectorization not only when the source address
> isn't aligned on the expected boundary but also when the
> destination address isn't.  This enhancement revealed
> an outstanding aspect of the regression (not fixed by Richard's
> already committed patch).
> 
> Besides this change, the patch also adds a number of other
> tests to better exercise the vectorization by verifying it
> takes place for arrays of elements of other sizes besides
> word: byte, half word, and double word.  Those tests reveal
> both another regression WRT 4.8 and further vectorization
> opportunities not exploited even in 4.8.  I marked the latter
> XFAIL in the tests so that when the regression is fully
> resolved, the tests should pass with no unexpected failures.

I have a hard time applying the patch because of line-wrapping issues
or my patch tool not grokking the git diffs.

Can you please either commit the patch or extract the testcase
that still regresses and paste it into PR63175?

Thanks,
Richard.


-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-02 13:58           ` Richard Biener
@ 2015-03-02 16:28             ` Martin Sebor
  2015-03-05 23:44               ` Martin Sebor
  2015-03-06 17:28               ` Jeff Law
  0 siblings, 2 replies; 19+ messages in thread
From: Martin Sebor @ 2015-03-02 16:28 UTC (permalink / raw)
  To: Richard Biener; +Cc: David Edelsohn, Jeff Law, GCC Patches


On 03/02/2015 06:58 AM, Richard Biener wrote:
> On Fri, 27 Feb 2015, Martin Sebor wrote:
>
>>>> Given that Martin's fix to the testcase allowed it to succeed without
>>>> Richi's fix for the underlying problem, is there a modification to the
>>>> testcase or a new testcase that would really test the optimization?
>>>
>>> Let me work on it.
>>
>> Below is a patch with a couple of minor tweaks to the existing
>> test first to update the search string and second to better
>> exercise the vectorization not only when the source address
>> isn't aligned on the expected boundary but also when the
>> destination address isn't.  This enhancement revealed
>> an outstanding aspect of the regression (not fixed by Richard's
>> already committed patch).
>>
>> Besides this change, the patch also adds a number of other
>> tests to better exercise the vectorization by verifying it
>> takes place for arrays of elements of other sizes besides
>> word: byte, half word, and double word.  Those tests reveal
>> both another regression WRT 4.8 and further vectorization
>> opportunities not exploited even in 4.8.  I marked the latter
>> XFAIL in the tests so that when the regression is fully
>> resolved, the tests should pass with no unexpected failures.
>
> I have a hard time applying the patch because of line-wrapping issues
> or my patch tool not groking the git diffs.
>
> Can you please either commit the patch or extract the testcase
> that still regresses and paste it into PR63175?

I pasted a couple of such test cases to the bug. The full patch
is also attached to this email in case there was a problem with
line wrapping.

Martin


[-- Attachment #2: pr63175.patch --]
[-- Type: text/x-patch, Size: 7050 bytes --]

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index cc86e37..4edd559 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,15 @@
+2015-02-27  Martin Sebor  <msebor@redhat.com>
+
+	PR testsuite/63175
+	* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c (main1): Rename...
+	(copy_to_unaligned): ...to this and move checking of results into
+	main.
+	(copy_from_unaligned): New function.
+	* costmodel-bb-slp-pr63175-base.c: New test.
+	* costmodel-bb-slp-pr63175-dword.c: New test.
+	* costmodel-bb-slp-pr63175-hword.c: New test.
+	* costmodel-bb-slp-pr63175-word.c: New test.
+
 2015-02-27  Jakub Jelinek  <jakub@redhat.com>
 
 	PR tree-optimization/65048
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
index e1bc1a8..a2dc367 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
@@ -1,46 +1,63 @@
 /* { dg-require-effective-target vect_int } */
 
-#include <stdarg.h>
 #include "../../tree-vect.h"
 
 #define N 16 
 
 unsigned int out[N];
-unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+const unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
 
-__attribute__ ((noinline)) int
-main1 (unsigned int x, unsigned int y)
+__attribute__ ((noinline)) void
+copy_to_unaligned (void)
 {
-  int i;
-  unsigned int *pin = &in[1];
+  const unsigned int *pin = &in[1];
   unsigned int *pout = &out[0];
-  unsigned int a0, a1, a2, a3;
 
   /* Misaligned load.  */
   *pout++ = *pin++;
   *pout++ = *pin++;
   *pout++ = *pin++;
   *pout++ = *pin++;
+}
 
-  /* Check results.  */
-  if (out[0] != in[1]
-      || out[1] != in[2]
-      || out[2] != in[3]
-      || out[3] != in[4])
-    abort();
+__attribute__ ((noinline)) void
+copy_from_unaligned (void)
+{
+  const unsigned int *pin = &in[0];
+  unsigned int *pout = &out[1];
 
-  return 0;
+  /* Misaligned store.  */
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
 }
 
 int main (void)
 {
   check_vect ();
 
-  main1 (2, 3);
+  copy_to_unaligned ();
+
+  /* Check results outside of copy_to_unaligned, where the check
+     would likely be optimized away.  */
+  if (out[0] != in[1]
+      || out[1] != in[2]
+      || out[2] != in[3]
+      || out[3] != in[4])
+    abort();
+
+  copy_from_unaligned ();
+
+  if (out[1] != in[0]
+      || out[2] != in[1]
+      || out[3] != in[2]
+      || out[4] != in[3])
+    abort();
 
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp2"  { xfail  vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2"  { xfail  vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "slp2" } } */
   
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c
new file mode 100644
index 0000000..b94cf0e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c
@@ -0,0 +1,45 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define DEFINE_DATA(T) \
+    const T T ## _ ## src[] = { \
+         0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,\
+        16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 }; \
+    T T ## _ ## dst[sizeof T ## _ ## src / sizeof (T)]
+
+#define DEFINE_COPY_M_N(T, srcoff, dstoff) \
+void copy_ ## T ## _ ## srcoff ## _ ## dstoff (void) { \
+    const T *s = T ## _ ## src + srcoff; \
+    T *d = T ## _ ## dst + dstoff; \
+    unsigned i; \
+    for (i = 0; i != 16 / sizeof *s; ++i) \
+        *d++ = *s++; \
+}
+
+#define DEFINE_COPY_M(T, M) \
+    DEFINE_COPY_M_N (T, M, 0); \
+    DEFINE_COPY_M_N (T, M, 1); \
+    DEFINE_COPY_M_N (T, M, 2); \
+    DEFINE_COPY_M_N (T, M, 3)
+
+#define TEST_COPY(T) \
+    DEFINE_DATA (T); \
+    DEFINE_COPY_M (T, 0); \
+    DEFINE_COPY_M (T, 1); \
+    DEFINE_COPY_M (T, 2); \
+    DEFINE_COPY_M (T, 3)
+
+#ifndef Type
+#  define Type char
+#endif
+
+TEST_COPY (Type);
+
+/* Verify that the assembly contains vector instructions alone
+   with no byte loads (lb, lbu, lbz, lbzu, or their indexed forms)
+   or byte stores (stb, stbu, stbx, stbux, or their indexed
+   forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tlbz?u?x? " { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "\tstbu?x? " { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-dword.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-dword.c
new file mode 100644
index 0000000..87d5cb6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-dword.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define Type long
+#include "costmodel-bb-slp-pr63175-base.c"
+
+/* Verify that the assembly contains vector instructions alone
+   with no doubleword loads (ld, ldu, or their indexed forms)
+   or stores (std, stdu, or their indexed forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tldu?x? " } } */
+/* { dg-final { scan-assembler-not "\tstdu?x? " } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-hword.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-hword.c
new file mode 100644
index 0000000..8c22294
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-hword.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define Type short
+#include "costmodel-bb-slp-pr63175-base.c"
+
+/* Verify that the assembly contains vector instructions alone
+   with no halfword loads (lh, lhz, lhzu or their indexed forms)
+   or halfword stores (sth, sthu, or their indexed forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tlhz?u?x? " { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "\tsthu?x? " { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-word.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-word.c
new file mode 100644
index 0000000..942b8a6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-word.c
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define Type int
+#include "costmodel-bb-slp-pr63175-base.c"
+
+/* Verify that the assembly contains vector instructions alone
+   with no word loads (lw, lwz, lwzu or their indexed forms)
+   or word stores (stw or stwu, or their indexed forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tlwz?u?x? " } } */
+/* { dg-final { scan-assembler-not "\tstwu?x? " } } */


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-02 16:28             ` Martin Sebor
@ 2015-03-05 23:44               ` Martin Sebor
  2015-03-06  8:45                 ` Richard Biener
  2015-03-06 17:28               ` Jeff Law
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Sebor @ 2015-03-05 23:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: David Edelsohn, Jeff Law, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 462 bytes --]

Attached is a scaled down version of the test for the bug.
It fixes the scan-tree-dump-times string to match what GCC
5 prints and moves the result checking out of the test
function and into main to prevent it from getting optimized
away (as observed in comment #8 on the bug).
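
A minimal sketch of that restructuring, not taken from the attached
patch: the names copy_block and check_block are hypothetical, and
<assert.h> is included only so that a caller can assert on the result.
The copy stays in its own noinline function, and the comparison lives
in a separate function that the caller invokes afterwards.

```c
#include <assert.h>

#define N 16

unsigned int out[N];
const unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};

/* Hypothetical stand-in for the test function.  It only performs the
   four-element copy from a misaligned source, so nothing else in the
   function body can influence how the copy is compiled.  */
__attribute__ ((noinline)) void
copy_block (void)
{
  const unsigned int *pin = &in[1];
  unsigned int *pout = &out[0];

  *pout++ = *pin++;
  *pout++ = *pin++;
  *pout++ = *pin++;
  *pout++ = *pin++;
}

/* Hypothetical checker kept outside the copy function.
   Returns 1 when out[0..3] equals in[1..4], 0 otherwise.  */
int
check_block (void)
{
  return out[0] == in[1]
	 && out[1] == in[2]
	 && out[2] == in[3]
	 && out[3] == in[4];
}
```

Calling check_block () before copy_block () returns 0 because out is
still zero-initialized; calling it afterwards returns 1.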

The patch also adds a regression test for the bug to scan
the assembly for the absence of ordinary loads and stores.
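
To illustrate what the scan patterns in such a test accept and reject,
here is a small sketch.  It is an approximation under stated
assumptions: DejaGnu evaluates these patterns with Tcl regular
expressions, whereas the hypothetical helper line_matches below uses
POSIX extended regular expressions, which should behave the same for
patterns this simple.  The sample assembly lines mentioned afterwards
are made up for illustration and are not actual compiler output.

```c
#include <assert.h>
#include <stddef.h>
#include <regex.h>

/* Hypothetical helper: return 1 if LINE contains a match for the
   POSIX extended regular expression PATTERN, 0 otherwise.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);

  /* The patterns used here are fixed strings known to compile.  */
  assert (rc == 0);

  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, the load pattern "\tlwz?u?x? " matches lines such as
"\tlwz 9,0(10)" and "\tlwzux 9,10,8" but not "\tlxvd2x 0,0,9", and the
store pattern "\tstwu?x? " matches "\tstw 9,0(10)" but not
"\tstxvd2x 0,0,9".  The trailing space in each pattern is what keeps
the vector mnemonics from matching.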

Tested on ppc64le-linux.

Does it look okay to everyone?

Martin

[-- Attachment #2: gcc-63175.patch --]
[-- Type: text/x-patch, Size: 3275 bytes --]

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2e77ba4..27f41fd 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2015-03-05  Martin Sebor  <msebor@redhat.com>
+
+	PR testsuite/63175
+	* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c (main1): Move
+	checking of results into main to prevent it from getting optimized
+	away.
+	* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c: New test.
+
 2015-03-04  Ian Lance Taylor  <iant@google.com>
 
 	* go.test/go-test.exp (go-gc-tests): Skip nilptr test on s390*.
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
new file mode 100644
index 0000000..73c0afa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c
@@ -0,0 +1,30 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define N 16 
+
+const unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+unsigned int out[N];
+
+__attribute__ ((noinline)) int
+main1 (void)
+{
+  const unsigned int *pin = &in[1];
+  unsigned int *pout = &out[0];
+
+  /* Misaligned load.  */
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+
+  return 0;
+}
+
+/* Verify that the assembly contains vector instructions alone
+   with no word loads (lw, lwu, lwz, lwzu, or their indexed forms)
+   or word stores (stw, stwu, stwx, stwux, or their indexed forms).  */
+
+/* { dg-final { scan-assembler "\t\(lxv|lvsr|stxv\)" } } */
+/* { dg-final { scan-assembler-not "\tlwz?u?x? " } } */
+/* { dg-final { scan-assembler-not "\tstwu?x? " } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
index e1bc1a8..45046f4 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
@@ -1,6 +1,5 @@
 /* { dg-require-effective-target vect_int } */
 
-#include <stdarg.h>
 #include "../../tree-vect.h"
 
 #define N 16 
@@ -9,12 +8,10 @@ unsigned int out[N];
 unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
 
 __attribute__ ((noinline)) int
-main1 (unsigned int x, unsigned int y)
+main1 (void)
 {
-  int i;
   unsigned int *pin = &in[1];
   unsigned int *pout = &out[0];
-  unsigned int a0, a1, a2, a3;
 
   /* Misaligned load.  */
   *pout++ = *pin++;
@@ -22,13 +19,6 @@ main1 (unsigned int x, unsigned int y)
   *pout++ = *pin++;
   *pout++ = *pin++;
 
-  /* Check results.  */
-  if (out[0] != in[1]
-      || out[1] != in[2]
-      || out[2] != in[3]
-      || out[3] != in[4])
-    abort();
-
   return 0;
 }
 
@@ -36,11 +26,17 @@ int main (void)
 {
   check_vect ();
 
-  main1 (2, 3);
+  main1 ();
+
+  /* Check results.  */
+  if (out[0] != in[1]
+      || out[1] != in[2]
+      || out[2] != in[3]
+      || out[3] != in[4])
+    abort();
 
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp2"  { xfail  vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2"  { xfail  vect_no_align } } } */
 /* { dg-final { cleanup-tree-dump "slp2" } } */
-  


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-05 23:44               ` Martin Sebor
@ 2015-03-06  8:45                 ` Richard Biener
  0 siblings, 0 replies; 19+ messages in thread
From: Richard Biener @ 2015-03-06  8:45 UTC (permalink / raw)
  To: Martin Sebor; +Cc: David Edelsohn, Jeff Law, GCC Patches

On Thu, 5 Mar 2015, Martin Sebor wrote:

> Attached is a scaled down version of the test for the bug.
> It fixes the scan-tree-dump-times string to match what GCC
> 5 prints and moves the result checking out of the test
> function and into main to prevent it from getting optimized
> away (as observed in comment #8 on the bug).
> 
> The patch also adds a regression test for the bug to scan
> the assembly for the absence of ordinary loads and stores.
> 
> Tested on ppc64le-linux.
> 
> Does it look okay to everyone?

Looks ok to me.

Thanks,
Richard.


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-02 16:28             ` Martin Sebor
  2015-03-05 23:44               ` Martin Sebor
@ 2015-03-06 17:28               ` Jeff Law
  2015-03-06 20:22                 ` Martin Sebor
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff Law @ 2015-03-06 17:28 UTC (permalink / raw)
  To: Martin Sebor, Richard Biener; +Cc: David Edelsohn, GCC Patches

On 03/02/15 09:28, Martin Sebor wrote:
> On 03/02/2015 06:58 AM, Richard Biener wrote:
>> On Fri, 27 Feb 2015, Martin Sebor wrote:
>>
>>>>> Given that Martin's fix to the testcase allowed it to succeed without
>>>>> Richi's fix for the underlying problem, is there a modification to the
>>>>> testcase or a new testcase that would really test the optimization?
>>>>
>>>> Let me work on it.
>>>
>>> Below is a patch with a couple of minor tweaks to the existing
>>> test first to update the search string and second to better
>>> exercise the vectorization not only when the source address
>>> isn't aligned on the expected boundary but also when the
>>> destination address isn't.  This enhancement revealed
>>> an outstanding aspect of the regression (not fixed by Richard's
>>> already committed patch).
>>>
>>> Besides this change, the patch also adds a number of other
>>> tests to better exercise the vectorization by verifying it
>>> takes place for arrays of elements of other sizes besides
>>> word: byte, half word, and double word.  Those tests reveal
>>> both another regression WRT 4.8 and further vectorization
>>> opportunities not exploited even in 4.8.  I marked the latter
>>> XFAIL in the tests so that when the regression is fully
>>> resolved, the tests should pass with no unexpected failures.
>>
>> I have a hard time applying the patch because of line-wrapping issues
>> or my patch tool not groking the git diffs.
>>
>> Can you please either commit the patch or extract the testcase
>> that still regresses and paste it into PR63175?
>
> I pasted a couple of such test cases to the bug. The full patch
> is also attached to this email in case there was a problem with
> line wrapping.

So for the unaligned case, is that really a regression when compared to
earlier compilers?  If not, then it seems that we ought to at least be
at a point where the regression marker for that BZ can be removed,
right?  I.e., Richi's patch fixed the actual code quality regression and
your patch fixes the testsuite aspects, right?


jeff


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-06 17:28               ` Jeff Law
@ 2015-03-06 20:22                 ` Martin Sebor
  2015-03-07  8:34                   ` Richard Biener
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Sebor @ 2015-03-06 20:22 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: David Edelsohn, GCC Patches

On 03/06/2015 10:28 AM, Jeff Law wrote:
> On 03/02/15 09:28, Martin Sebor wrote:
>> On 03/02/2015 06:58 AM, Richard Biener wrote:
>>> On Fri, 27 Feb 2015, Martin Sebor wrote:
>>>
>>>>>> Given that Martin's fix to the testcase allowed it to succeed without
>>>>>> Richi's fix for the underlying problem, is there a modification to
>>>>>> the
>>>>>> testcase or a new testcase that would really test the optimization?
>>>>>
>>>>> Let me work on it.
>>>>
>>>> Below is a patch with a couple of minor tweaks to the existing
>>>> test first to update the search string and second to better
>>>> exercise the vectorization not only when the source address
>>>> isn't aligned on the expected boundary but also when the
>>>> destination address isn't.  This enhancement revealed
>>>> an outstanding aspect of the regression (not fixed by Richard's
>>>> already committed patch).
>>>>
>>>> Besides this change, the patch also adds a number of other
>>>> tests to better exercise the vectorization by verifying it
>>>> takes place for arrays of elements of other sizes besides
>>>> word: byte, half word, and double word.  Those tests reveal
>>>> both another regression WRT 4.8 and further vectorization
>>>> opportunities not exploited even in 4.8.  I marked the latter
>>>> XFAIL in the tests so that when the regression is fully
>>>> resolved, the tests should pass with no unexpected failures.
>>>
>>> I have a hard time applying the patch because of line-wrapping issues
>>> or my patch tool not groking the git diffs.
>>>
>>> Can you please either commit the patch or extract the testcase
>>> that still regresses and paste it into PR63175?
>>
>> I pasted a couple of such test cases to the bug. The full patch
>> is also attached to this email in case there was a problem with
>> line wrapping.
> So for the unaligned case, is that really a regression when compared to
> earlier compilers?   If not, then it seems that we ought to at least be
> at a point where the regression marker for that BZ can be removed,
> right?  ie, Richi's patch fixed the actual code quality regression and
> your patch fixes the testsuite aspects, right?

My interpretation of the bug report is that it points out
two problems:

1) a failure in the costmodel-bb-slp-9a.c test
2) a quality regression observed by inspecting the assembly
    emitted for the test

The two are unrelated in that (2) didn't cause (1).

Since Richi's patch fixed (2) and my latest patch fixes (1)
I would be inclined to consider the bug resolved.

While GCC 5 doesn't vectorize some code that 4.8 does with
the same options, it's apparently by accident (or due to
a bug in 4.8).  Since 5.0 does vectorize the same code when
the right set of options is specified, I agree with others
that none of my additional tests has exposed any other
regressions than the one that's already been addressed.

Martin


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-06 20:22                 ` Martin Sebor
@ 2015-03-07  8:34                   ` Richard Biener
  2015-03-07 16:20                     ` Jeff Law
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Biener @ 2015-03-07  8:34 UTC (permalink / raw)
  To: Martin Sebor, Jeff Law; +Cc: David Edelsohn, GCC Patches

On March 6, 2015 9:22:05 PM CET, Martin Sebor <msebor@redhat.com> wrote:
>On 03/06/2015 10:28 AM, Jeff Law wrote:
>> On 03/02/15 09:28, Martin Sebor wrote:
>>> On 03/02/2015 06:58 AM, Richard Biener wrote:
>>>> On Fri, 27 Feb 2015, Martin Sebor wrote:
>>>>
>>>>>>> Given that Martin's fix to the testcase allowed it to succeed
>without
>>>>>>> Richi's fix for the underlying problem, is there a modification
>to
>>>>>>> the
>>>>>>> testcase or a new testcase that would really test the
>optimization?
>>>>>>
>>>>>> Let me work on it.
>>>>>
>>>>> Below is a patch with a couple of minor tweaks to the existing
>>>>> test first to update the search string and second to better
>>>>> exercise the vectorization not only when the source address
>>>>> isn't aligned on the expected boundary but also when the
>>>>> destination address isn't.  This enhancement revealed
>>>>> an outstanding aspect of the regression (not fixed by Richard's
>>>>> already committed patch).
>>>>>
>>>>> Besides this change, the patch also adds a number of other
>>>>> tests to better exercise the vectorization by verifying it
>>>>> takes place for arrays of elements of other sizes besides
>>>>> word: byte, half word, and double word.  Those tests reveal
>>>>> both another regression WRT 4.8 and further vectorization
>>>>> opportunities not exploited even in 4.8.  I marked the latter
>>>>> XFAIL in the tests so that when the regression is fully
>>>>> resolved, the tests should pass with no unexpected failures.
>>>>
>>>> I have a hard time applying the patch because of line-wrapping
>issues
>>>> or my patch tool not groking the git diffs.
>>>>
>>>> Can you please either commit the patch or extract the testcase
>>>> that still regresses and paste it into PR63175?
>>>
>>> I pasted a couple of such test cases to the bug. The full patch
>>> is also attached to this email in case there was a problem with
>>> line wrapping.
>> So for the unaligned case, is that really a regression when compared
>to
>> earlier compilers?   If not, then it seems that we ought to at least
>be
>> at a point where the regression marker for that BZ can be removed,
>> right?  ie, Richi's patch fixed the actual code quality regression
>and
>> your patch fixes the testsuite aspects, right?
>
>My interpretation of the bug report is that it points out
>two problems:
>
>1) a failure in the costmodel-bb-slp-9a.c test
>2) a quality regression observed by inspecting the assembly
>    emitted for the test
>
>The two are unrelated in that (2) didn't cause (1).
>
>Since Richi's patch fixed (2) and my latest patch fixes (1)
>I would be inclined to consider the bug resolved.
>
>While GCC 5 doesn't vectorize some code that 4.8 does with
>the same options, it's apparently by accident (or due to
>a bug in 4.8).  Since 5.0 does vectorize the same code when
>the right set of options is specified, I agree with others
>that none of my additional tests has exposed any other
>regressions than the one that's already been addressed.

Yes. Once the test cases have been fixed we should close the bug as fixed.

Richard.

>Martin



* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-07  8:34                   ` Richard Biener
@ 2015-03-07 16:20                     ` Jeff Law
  2015-03-08  9:14                       ` Richard Biener
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff Law @ 2015-03-07 16:20 UTC (permalink / raw)
  To: Richard Biener, Martin Sebor; +Cc: David Edelsohn, GCC Patches

On 03/07/15 01:34, Richard Biener wrote:
> On March 6, 2015 9:22:05 PM CET, Martin Sebor <msebor@redhat.com> wrote:
>> On 03/06/2015 10:28 AM, Jeff Law wrote:
>>> On 03/02/15 09:28, Martin Sebor wrote:
>>>> On 03/02/2015 06:58 AM, Richard Biener wrote:
>>>>> On Fri, 27 Feb 2015, Martin Sebor wrote:
>>>>>
>>>>>>>> Given that Martin's fix to the testcase allowed it to succeed
>> without
>>>>>>>> Richi's fix for the underlying problem, is there a modification
>> to
>>>>>>>> the
>>>>>>>> testcase or a new testcase that would really test the
>> optimization?
>>>>>>>
>>>>>>> Let me work on it.
>>>>>>
>>>>>> Below is a patch with a couple of minor tweaks to the existing
>>>>>> test first to update the search string and second to better
>>>>>> exercise the vectorization not only when the source address
>>>>>> isn't aligned on the expected boundary but also when the
>>>>>> destination address isn't.  This enhancement revealed
>>>>>> an outstanding aspect of the regression (not fixed by Richard's
>>>>>> already committed patch).
>>>>>>
>>>>>> Besides this change, the patch also adds a number of other
>>>>>> tests to better exercise the vectorization by verifying it
>>>>>> takes place for arrays of elements of other sizes besides
>>>>>> word: byte, half word, and double word.  Those tests reveal
>>>>>> both another regression WRT 4.8 and further vectorization
>>>>>> opportunities not exploited even in 4.8.  I marked the latter
>>>>>> XFAIL in the tests so that when the regression is fully
>>>>>> resolved, the tests should pass with no unexpected failures.
>>>>>
>>>>> I have a hard time applying the patch because of line-wrapping
>> issues
>>>>> or my patch tool not groking the git diffs.
>>>>>
>>>>> Can you please either commit the patch or extract the testcase
>>>>> that still regresses and paste it into PR63175?
>>>>
>>>> I pasted a couple of such test cases to the bug. The full patch
>>>> is also attached to this email in case there was a problem with
>>>> line wrapping.
>>> So for the unaligned case, is that really a regression when compared
>> to
>>> earlier compilers?   If not, then it seems that we ought to at least
>> be
>>> at a point where the regression marker for that BZ can be removed,
>>> right?  ie, Richi's patch fixed the actual code quality regression
>> and
>>> your patch fixes the testsuite aspects, right?
>>
>> My interpretation of the bug report is that it points out
>> two problems:
>>
>> 1) a failure in the costmodel-bb-slp-9a.c test
>> 2) a quality regression observed by inspecting the assembly
>>     emitted for the test
>>
>> The two are unrelated in that (2) didn't cause (1).
>>
>> Since Richi's patch fixed (2) and my latest patch fixes (1)
>> I would be inclined to consider the bug resolved.
>>
>> While GCC 5 doesn't vectorize some code that 4.8 does with
>> the same options, it's apparently by accident (or due to
>> a bug in 4.8).  Since 5.0 does vectorize the same code when
>> the right set of options is specified, I agree with others
>> that none of my additional tests has exposed any other
>> regressions than the one that's already been addressed.
>
> Yes. Once the test cases have been fixed we should close the bug as fixed.
Trunk regression marker removed.  Not sure if it is worth backporting to 
4.9, but I left the 4.9 regression marker and the BZ open just in case.

jeff


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-07 16:20                     ` Jeff Law
@ 2015-03-08  9:14                       ` Richard Biener
  2015-03-09 18:52                         ` Jeff Law
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Biener @ 2015-03-08  9:14 UTC (permalink / raw)
  To: Jeff Law, Martin Sebor; +Cc: David Edelsohn, GCC Patches

On March 7, 2015 5:20:08 PM CET, Jeff Law <law@redhat.com> wrote:
>On 03/07/15 01:34, Richard Biener wrote:
>> On March 6, 2015 9:22:05 PM CET, Martin Sebor <msebor@redhat.com>
>wrote:
>>> On 03/06/2015 10:28 AM, Jeff Law wrote:
>>>> On 03/02/15 09:28, Martin Sebor wrote:
>>>>> On 03/02/2015 06:58 AM, Richard Biener wrote:
>>>>>> On Fri, 27 Feb 2015, Martin Sebor wrote:
>>>>>>
>>>>>>>>> Given that Martin's fix to the testcase allowed it to succeed
>>> without
>>>>>>>>> Richi's fix for the underlying problem, is there a
>modification
>>> to
>>>>>>>>> the
>>>>>>>>> testcase or a new testcase that would really test the
>>> optimization?
>>>>>>>>
>>>>>>>> Let me work on it.
>>>>>>>
>>>>>>> Below is a patch with a couple of minor tweaks to the existing
>>>>>>> test first to update the search string and second to better
>>>>>>> exercise the vectorization not only when the source address
>>>>>>> isn't aligned on the expected boundary but also when the
>>>>>>> destination address isn't.  This enhancement revealed
>>>>>>> an outstanding aspect of the regression (not fixed by Richard's
>>>>>>> already committed patch).
>>>>>>>
>>>>>>> Besides this change, the patch also adds a number of other
>>>>>>> tests to better exercise the vectorization by verifying it
>>>>>>> takes place for arrays of elements of other sizes besides
>>>>>>> word: byte, half word, and double word.  Those tests reveal
>>>>>>> both another regression WRT 4.8 and further vectorization
>>>>>>> opportunities not exploited even in 4.8.  I marked the latter
>>>>>>> XFAIL in the tests so that when the regression is fully
>>>>>>> resolved, the tests should pass with no unexpected failures.
>>>>>>
>>>>>> I have a hard time applying the patch because of line-wrapping
>>> issues
>>>>>> or my patch tool not groking the git diffs.
>>>>>>
>>>>>> Can you please either commit the patch or extract the testcase
>>>>>> that still regresses and paste it into PR63175?
>>>>>
>>>>> I pasted a couple of such test cases to the bug. The full patch
>>>>> is also attached to this email in case there was a problem with
>>>>> line wrapping.
>>>> So for the unaligned case, is that really a regression when
>compared
>>> to
>>>> earlier compilers?   If not, then it seems that we ought to at
>least
>>> be
>>>> at a point where the regression marker for that BZ can be removed,
>>>> right?  ie, Richi's patch fixed the actual code quality regression
>>> and
>>>> your patch fixes the testsuite aspects, right?
>>>
>>> My interpretation of the bug report is that it points out
>>> two problems:
>>>
>>> 1) a failure in the costmodel-bb-slp-9a.c test
>>> 2) a quality regression observed by inspecting the assembly
>>>     emitted for the test
>>>
>>> The two are unrelated in that (2) didn't cause (1).
>>>
>>> Since Richi's patch fixed (2) and my latest patch fixes (1)
>>> I would be inclined to consider the bug resolved.
>>>
>>> While GCC 5 doesn't vectorize some code that 4.8 does with
>>> the same options, it's apparently by accident (or due to
>>> a bug in 4.8).  Since 5.0 does vectorize the same code when
>>> the right set of options is specified, I agree with others
>>> that none of my additional tests has exposed any other
>>> regressions than the one that's already been addressed.
>>
>> Yes. Once the test cases have been fixed we should close the bug as
>fixed.
>Trunk regression marker removed.  Not sure if it is worth backporting
>to 
>4.9, but I left the 4.9 regression marker and the BZ open just in case.

I backported the fix to the 4.9 branch already, so it would be nice to get the test case fixes there as well.

Richard.

>jeff



* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-08  9:14                       ` Richard Biener
@ 2015-03-09 18:52                         ` Jeff Law
  2015-03-09 20:18                           ` Martin Sebor
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff Law @ 2015-03-09 18:52 UTC (permalink / raw)
  To: Richard Biener, Martin Sebor; +Cc: David Edelsohn, GCC Patches

On 03/08/15 03:14, Richard Biener wrote:
> On March 7, 2015 5:20:08 PM CET, Jeff Law <law@redhat.com> wrote:
>> On 03/07/15 01:34, Richard Biener wrote:
>>> On March 6, 2015 9:22:05 PM CET, Martin Sebor <msebor@redhat.com> wrote:
>>>> On 03/06/2015 10:28 AM, Jeff Law wrote:
>>>>> On 03/02/15 09:28, Martin Sebor wrote:
>>>>>> On 03/02/2015 06:58 AM, Richard Biener wrote:
>>>>>>> On Fri, 27 Feb 2015, Martin Sebor wrote:
>>>>>>>
>>>>>>>>>> Given that Martin's fix to the testcase allowed it to succeed without
>>>>>>>>>> Richi's fix for the underlying problem, is there a modification to
>>>>>>>>>> the testcase or a new testcase that would really test the optimization?
>>>>>>>>>
>>>>>>>>> Let me work on it.
>>>>>>>>
>>>>>>>> Below is a patch with a couple of minor tweaks to the existing
>>>>>>>> test first to update the search string and second to better
>>>>>>>> exercise the vectorization not only when the source address
>>>>>>>> isn't aligned on the expected boundary but also when the
>>>>>>>> destination address isn't.  This enhancement revealed
>>>>>>>> an outstanding aspect of the regression (not fixed by Richard's
>>>>>>>> already committed patch).
>>>>>>>>
>>>>>>>> Besides this change, the patch also adds a number of other
>>>>>>>> tests to better exercise the vectorization by verifying it
>>>>>>>> takes place for arrays of elements of other sizes besides
>>>>>>>> word: byte, half word, and double word.  Those tests reveal
>>>>>>>> both another regression WRT 4.8 and further vectorization
>>>>>>>> opportunities not exploited even in 4.8.  I marked the latter
>>>>>>>> XFAIL in the tests so that when the regression is fully
>>>>>>>> resolved, the tests should pass with no unexpected failures.
>>>>>>>
>>>>>>> I have a hard time applying the patch because of line-wrapping issues
>>>>>>> or my patch tool not grokking the git diffs.
>>>>>>>
>>>>>>> Can you please either commit the patch or extract the testcase
>>>>>>> that still regresses and paste it into PR63175?
>>>>>>
>>>>>> I pasted a couple of such test cases to the bug. The full patch
>>>>>> is also attached to this email in case there was a problem with
>>>>>> line wrapping.
>>>>> So for the unaligned case, is that really a regression when compared
>>>>> to earlier compilers?   If not, then it seems that we ought to at least
>>>>> be at a point where the regression marker for that BZ can be removed,
>>>>> right?  ie, Richi's patch fixed the actual code quality regression and
>>>>> your patch fixes the testsuite aspects, right?
>>>>
>>>> My interpretation of the bug report is that it points out
>>>> two problems:
>>>>
>>>> 1) a failure in the costmodel-bb-slp-9a.c test
>>>> 2) a quality regression observed by inspecting the assembly
>>>>      emitted for the test
>>>>
>>>> The two are unrelated in that (2) didn't cause (1).
>>>>
>>>> Since Richi's patch fixed (2) and my latest patch fixes (1)
>>>> I would be inclined to consider the bug resolved.
>>>>
>>>> While GCC 5 doesn't vectorize some code that 4.8 does with
>>>> the same options, it's apparently by accident (or due to
>>>> a bug in 4.8).  Since 5.0 does vectorize the same code when
>>>> the right set of options is specified, I agree with others
>>>> that none of my additional tests has exposed any other
>>>> regressions than the one that's already been addressed.
>>>
>>> Yes. Once the test cases have been fixed we should close the bug as fixed.
>> Trunk regression marker removed.  Not sure if it is worth backporting to
>> 4.9, but I left the 4.9 regression marker and the BZ open just in case.
>
> I backported the fix to the 4.9 branch already, so it would be nice to get the test case fixes there as well.
Martin -- that's your cue ;-)

jeff
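
For illustration only, here is a minimal sketch of the kind of straight-line
code the test changes quoted above describe: a group of adjacent loads, adds,
and stores, instantiated for byte, half word, word, and double word elements.
The macro DEFINE_GROUP_ADD, the function names, the arrays, and the constants
are all hypothetical and are not taken from the testsuite files. Whether GCC
actually vectorizes such a block depends on the target, the compiler version,
and the options used.

```c
#include <stdint.h>

/* Hypothetical helper: defines a function that loads four adjacent
   elements, adds an arbitrary constant to each, and stores the four
   results to adjacent destination elements.  All loads happen before
   any store, so the body is one straight-line group.  */
#define DEFINE_GROUP_ADD(T, NAME)          \
    void NAME(const T *src, T *dst)        \
    {                                      \
        T a0 = (T)(src[0] + 23);           \
        T a1 = (T)(src[1] + 142);          \
        T a2 = (T)(src[2] + 2);            \
        T a3 = (T)(src[3] + 31);           \
        dst[0] = a0;                       \
        dst[1] = a1;                       \
        dst[2] = a2;                       \
        dst[3] = a3;                       \
    }

DEFINE_GROUP_ADD(uint8_t,  group_add_u8)   /* byte */
DEFINE_GROUP_ADD(uint16_t, group_add_u16)  /* half word */
DEFINE_GROUP_ADD(uint32_t, group_add_u32)  /* word */
DEFINE_GROUP_ADD(uint64_t, group_add_u64)  /* double word */

/* Sample data for the word-sized variant.  */
uint32_t in32[8] = {0, 1, 2, 3, 4, 5, 6, 7};
uint32_t out32[8];
```

Calling group_add_u32(&in32[1], &out32[1]) starts both the source and the
destination one element past the array base, which is meant to mirror the
case where neither address sits on the expected boundary.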


* Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
  2015-03-09 18:52                         ` Jeff Law
@ 2015-03-09 20:18                           ` Martin Sebor
  0 siblings, 0 replies; 19+ messages in thread
From: Martin Sebor @ 2015-03-09 20:18 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: David Edelsohn, GCC Patches

>> I backported the fix to the 4.9 branch already, so it would be nice to
>> get the test case fixes there as well.
> Martin -- that's your cue ;-)

Sure. It's on my list of things to do.

Martin


* [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1
@ 2015-02-21 21:02 Martin Sebor
  0 siblings, 0 replies; 19+ messages in thread
From: Martin Sebor @ 2015-02-21 21:02 UTC (permalink / raw)
  To: Gcc Patch List

The trivial patch below fixes the failure in
gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c on ppc64 and ppc
noted in PR63175.

Martin

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 220801)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2015-02-21  Martin Sebor  <msebor@redhat.com>
+
+	PR testsuite/63175
+	* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c: Correct
+	expected string.
+
  2015-02-18  Jakub Jelinek  <jakub@redhat.com>

  	PR gcov-profile/64634
Index: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
===================================================================
--- gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c	(revision 220801)
+++ gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c	(working copy)
@@ -41,6 +41,6 @@
    return 0;
  }

-/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp2"  { xfail  vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp2"  { xfail  vect_no_align } } } */
  /* { dg-final { cleanup-tree-dump "slp2" } } */
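
For illustration only (not part of the patch): a minimal C sketch of why the
old search string could never match. It assumes, as Richard Biener's reply
notes, that the dump contains the messages "basic block vectorized" and
"Basic block will be vectorized using SLP" but never "basic block vectorized
using SLP". The helper count_occurrences and the sample_dump text are
hypothetical, and plain case-sensitive substring counting only approximates
the regexp matching that scan-tree-dump-times performs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count non-overlapping, case-sensitive
   occurrences of NEEDLE in TEXT.  */
size_t count_occurrences(const char *text, const char *needle)
{
    size_t count = 0;
    size_t len = strlen(needle);

    if (len == 0)
        return 0;

    for (const char *p = strstr(text, needle); p != NULL;
         p = strstr(p + len, needle))
        count++;

    return count;
}

/* Hypothetical dump excerpt holding only the two message strings
   mentioned in the thread; a real dump contains much more.  */
const char sample_dump[] =
    "Basic block will be vectorized using SLP\n"
    "basic block vectorized\n";
```

With this sample, the old string "basic block vectorized using SLP" is found
zero times and the corrected string "basic block vectorized" exactly once,
which is the count the updated directive expects.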


end of thread, other threads:[~2015-03-09 20:18 UTC | newest]

Thread overview: 19+ messages
2015-02-22 18:49 [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1 David Edelsohn
2015-02-23 16:08 ` Richard Biener
2015-02-24  7:22 ` Martin Sebor
2015-02-24 20:12   ` Jeff Law
2015-02-27 14:30     ` David Edelsohn
2015-02-27 15:59       ` Martin Sebor
2015-02-28  1:30         ` Martin Sebor
2015-03-02 13:58           ` Richard Biener
2015-03-02 16:28             ` Martin Sebor
2015-03-05 23:44               ` Martin Sebor
2015-03-06  8:45                 ` Richard Biener
2015-03-06 17:28               ` Jeff Law
2015-03-06 20:22                 ` Martin Sebor
2015-03-07  8:34                   ` Richard Biener
2015-03-07 16:20                     ` Jeff Law
2015-03-08  9:14                       ` Richard Biener
2015-03-09 18:52                         ` Jeff Law
2015-03-09 20:18                           ` Martin Sebor
  -- strict thread matches above, loose matches on Subject: below --
2015-02-21 21:02 Martin Sebor
