public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [PATCH] Fix PR78189
@ 2017-01-20 16:35 Nick Clifton
  2017-01-23  9:05 ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Nick Clifton @ 2017-01-20 16:35 UTC (permalink / raw)
  To: christophe.lyon, rguenther, amker.cheng; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2102 bytes --]

Hi Guys,

  [I have been asked to look at this PR in the hopes that it can be
  fixed soon and so no longer act as a blocker for the gcc 7 branch].

  It seems to me that Richard's proposed patch does work:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00909.html

  The only problem is that the check_effective_target_vect_hw_misalign
  proc is always returning 0 (or false) for ARM, even when unaligned
  vectors are supported.  This is why Richard's patch introduces a new
  failure for the arm-* targets.

  So what I would like to suggest is an extended patch (attached) which
  also updates the check_effective_target_vect_hw_misalign proc to use
  the check_effective_target_arm_vect_no_misalign proc.  With this patch
  applied not only does the gcc.dg/vect/vect-strided-a-u8-i2-gap.c test
  for both big-endian and little-endian arm targets, but there is also a
  significant reduction in the number of failures in the gcc.dg/vect
  tests overall:

   Little Endian ARM:
< # of expected passes		3275
< # of unexpected failures	63
< # of unexpected successes	125
< # of expected failures	123
< # of unsupported tests	153
---
> # of expected passes		3448
> # of unexpected failures	2
> # of unexpected successes	14
> # of expected failures	131
> # of unsupported tests	151

  Big Endian ARM:
< # of expected passes		2995
< # of unexpected failures	269
< # of unexpected successes	21
< # of expected failures	128
---
> # of expected passes		3037
> # of unexpected failures	127
> # of unexpected successes	24
> # of expected failures	228

  Which looks like a win to me.  So - any objections to my applying this
  patch and then closing the PR ?

Cheers
  Nick

gcc/ChangeLog
2017-01-20  Richard Biener  <rguenther@suse.de>
	    Nick Clifton  <nickc@redhat.com>

	PR testsuite/78421
	* lib/target-supports.exp (check_effective_target_vect_hw_misalign):
	If the target is ARM return the result of the
	check_effective_target_arm_vect_no_misalign proc.
	* gcc.dg/vect/vect-strided-a-u8-i2-gap.c: If the target does not
	support unaligned vectors then only expect one of the loops to be
	unrolled.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: pr78421.patch --]
[-- Type: text/x-patch, Size: 1298 bytes --]

Index: gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c	(revision 244691)
+++ gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c	(working copy)
@@ -71,5 +71,6 @@
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target vect_strided2 } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target { vect_strided2 && { ! vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target { vect_strided2 && vect_hw_misalign } } } } */
   
Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	(revision 244691)
+++ gcc/testsuite/lib/target-supports.exp	(working copy)
@@ -5732,6 +5732,9 @@
 	     || ([istarget mips*-*-*] && [et-is-effective-target mips_msa]) } {
 	  set et_vect_hw_misalign_saved($et_index) 1
 	}
+	if { [istarget arm*-*-*] } {
+	    set et_vect_hw_misalign_saved($et_index) [check_effective_target_arm_vect_no_misalign]
+	}
     }
     verbose "check_effective_target_vect_hw_misalign:\
 	     returning $et_vect_hw_misalign_saved($et_index)" 2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2017-01-20 16:35 [PATCH] Fix PR78189 Nick Clifton
@ 2017-01-23  9:05 ` Richard Biener
  2017-01-23 19:45   ` Christophe Lyon
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2017-01-23  9:05 UTC (permalink / raw)
  To: Nick Clifton; +Cc: christophe.lyon, amker.cheng, gcc-patches

On Fri, 20 Jan 2017, Nick Clifton wrote:

> Hi Guys,
> 
>   [I have been asked to look at this PR in the hopes that it can be
>   fixed soon and so no longer act as a blocker for the gcc 7 branch].
> 
>   It seems to me that Richard's proposed patch does work:
> 
> https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00909.html
> 
>   The only problem is that the check_effective_target_vect_hw_misalign
>   proc is always returning 0 (or false) for ARM, even when unaligned
>   vectors are supported.  This is why Richard's patch introduces a new
>   failure for the arm-* targets.
> 
>   So what I would like to suggest is an extended patch (attached) which
>   also updates the check_effective_target_vect_hw_misalign proc to use
>   the check_effective_target_arm_vect_no_misalign proc.  With this patch
>   applied not only does the gcc.dg/vect/vect-strided-a-u8-i2-gap.c test
>   for both big-endian and little-endian arm targets, but there is also a
>   significant reduction in the number of failures in the gcc.dg/vect
>   tests overall:
> 
>    Little Endian ARM:
> < # of expected passes		3275
> < # of unexpected failures	63
> < # of unexpected successes	125
> < # of expected failures	123
> < # of unsupported tests	153
> ---
> > # of expected passes		3448
> > # of unexpected failures	2
> > # of unexpected successes	14
> > # of expected failures	131
> > # of unsupported tests	151
> 
>   Big Endian ARM:
> < # of expected passes		2995
> < # of unexpected failures	269
> < # of unexpected successes	21
> < # of expected failures	128
> ---
> > # of expected passes		3037
> > # of unexpected failures	127
> > # of unexpected successes	24
> > # of expected failures	228
> 
>   Which looks like a win to me.  So - any objections to my applying this
>   patch and then closing the PR ?

Ok.

Thanks,
Richard.

> Cheers
>   Nick
> 
> gcc/ChangeLog
> 2017-01-20  Richard Biener  <rguenther@suse.de>
> 	    Nick Clifton  <nickc@redhat.com>
> 
> 	PR testsuite/78421
> 	* lib/target-supports.exp (check_effective_target_vect_hw_misalign):
> 	If the target is ARM return the result of the
> 	check_effective_target_arm_vect_no_misalign proc.
> 	* gcc.dg/vect/vect-strided-a-u8-i2-gap.c: If the target does not
> 	support unaligned vectors then only expect one of the loops to be
> 	unrolled.
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2017-01-23  9:05 ` Richard Biener
@ 2017-01-23 19:45   ` Christophe Lyon
  2017-01-25 10:28     ` Kyrill Tkachov
  0 siblings, 1 reply; 15+ messages in thread
From: Christophe Lyon @ 2017-01-23 19:45 UTC (permalink / raw)
  To: Richard Biener; +Cc: Nick Clifton, Bin.Cheng, gcc-patches

Hi Nick,

On 23 January 2017 at 10:04, Richard Biener <rguenther@suse.de> wrote:
> On Fri, 20 Jan 2017, Nick Clifton wrote:
>
>> Hi Guys,
>>
>>   [I have been asked to look at this PR in the hopes that it can be
>>   fixed soon and so no longer act as a blocker for the gcc 7 branch].
>>
>>   It seems to me that Richard's proposed patch does work:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00909.html
>>
>>   The only problem is that the check_effective_target_vect_hw_misalign
>>   proc is always returning 0 (or false) for ARM, even when unaligned
>>   vectors are supported.  This is why Richard's patch introduces a new
>>   failure for the arm-* targets.
>>
>>   So what I would like to suggest is an extended patch (attached) which
>>   also updates the check_effective_target_vect_hw_misalign proc to use
>>   the check_effective_target_arm_vect_no_misalign proc.  With this patch
>>   applied not only does the gcc.dg/vect/vect-strided-a-u8-i2-gap.c test
>>   for both big-endian and little-endian arm targets, but there is also a
>>   significant reduction in the number of failures in the gcc.dg/vect
>>   tests overall:
>>
>>    Little Endian ARM:
>> < # of expected passes                3275
>> < # of unexpected failures    63
>> < # of unexpected successes   125
>> < # of expected failures      123
>> < # of unsupported tests      153
>> ---
>> > # of expected passes                3448
>> > # of unexpected failures    2
>> > # of unexpected successes   14
>> > # of expected failures      131
>> > # of unsupported tests      151
>>
>>   Big Endian ARM:
>> < # of expected passes                2995
>> < # of unexpected failures    269
>> < # of unexpected successes   21
>> < # of expected failures      128
>> ---
>> > # of expected passes                3037
>> > # of unexpected failures    127
>> > # of unexpected successes   24
>> > # of expected failures      228
>>
>>   Which looks like a win to me.  So - any objections to my applying this
>>   patch and then closing the PR ?
>
> Ok.
>

I must be missing something, but I see many regressions since you
committed this patch (r244796).
See
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/244796/report-build-info.html
for more details.

In short, on arm-*,
  gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorized 1 loops" 1
  gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
"vectorized 1 loops" 1
now FAIL instead of PASS.

on armeb, there are many more differences.

Christophe


> Thanks,
> Richard.
>
>> Cheers
>>   Nick
>>
>> gcc/ChangeLog
>> 2017-01-20  Richard Biener  <rguenther@suse.de>
>>           Nick Clifton  <nickc@redhat.com>
>>
>>       PR testsuite/78421
>>       * lib/target-supports.exp (check_effective_target_vect_hw_misalign):
>>       If the target is ARM return the result of the
>>       check_effective_target_arm_vect_no_misalign proc.
>>       * gcc.dg/vect/vect-strided-a-u8-i2-gap.c: If the target does not
>>       support unaligned vectors then only expect one of the loops to be
>>       unrolled.
>>
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2017-01-23 19:45   ` Christophe Lyon
@ 2017-01-25 10:28     ` Kyrill Tkachov
  0 siblings, 0 replies; 15+ messages in thread
From: Kyrill Tkachov @ 2017-01-25 10:28 UTC (permalink / raw)
  To: Christophe Lyon, Richard Biener; +Cc: Nick Clifton, Bin.Cheng, gcc-patches


On 23/01/17 19:26, Christophe Lyon wrote:
> Hi Nick,
>
> On 23 January 2017 at 10:04, Richard Biener <rguenther@suse.de> wrote:
>> On Fri, 20 Jan 2017, Nick Clifton wrote:
>>
>>> Hi Guys,
>>>
>>>    [I have been asked to look at this PR in the hopes that it can be
>>>    fixed soon and so no longer act as a blocker for the gcc 7 branch].
>>>
>>>    It seems to me that Richard's proposed patch does work:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2016-11/msg00909.html
>>>
>>>    The only problem is that the check_effective_target_vect_hw_misalign
>>>    proc is always returning 0 (or false) for ARM, even when unaligned
>>>    vectors are supported.  This is why Richard's patch introduces a new
>>>    failure for the arm-* targets.
>>>
>>>    So what I would like to suggest is an extended patch (attached) which
>>>    also updates the check_effective_target_vect_hw_misalign proc to use
>>>    the check_effective_target_arm_vect_no_misalign proc.  With this patch
>>>    applied not only does the gcc.dg/vect/vect-strided-a-u8-i2-gap.c test
>>>    for both big-endian and little-endian arm targets, but there is also a
>>>    significant reduction in the number of failures in the gcc.dg/vect
>>>    tests overall:
>>>
>>>     Little Endian ARM:
>>> < # of expected passes                3275
>>> < # of unexpected failures    63
>>> < # of unexpected successes   125
>>> < # of expected failures      123
>>> < # of unsupported tests      153
>>> ---
>>>> # of expected passes                3448
>>>> # of unexpected failures    2
>>>> # of unexpected successes   14
>>>> # of expected failures      131
>>>> # of unsupported tests      151
>>>    Big Endian ARM:
>>> < # of expected passes                2995
>>> < # of unexpected failures    269
>>> < # of unexpected successes   21
>>> < # of expected failures      128
>>> ---
>>>> # of expected passes                3037
>>>> # of unexpected failures    127
>>>> # of unexpected successes   24
>>>> # of expected failures      228
>>>    Which looks like a win to me.  So - any objections to my applying this
>>>    patch and then closing the PR ?
>> Ok.
>>
> I must be missing something, but I see many regressions since you
> committed this patch (r244796).
> See
> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/244796/report-build-info.html
> for more details.
>
> In short, on arm-*,
>    gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorized 1 loops" 1
>    gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
> "vectorized 1 loops" 1
> now FAIL instead of PASS.

I also see this when testing -mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard.

Thanks,
Kyrill

> on armeb, there are many more differences.
>
> Christophe
>
>
>> Thanks,
>> Richard.
>>
>>> Cheers
>>>    Nick
>>>
>>> gcc/ChangeLog
>>> 2017-01-20  Richard Biener  <rguenther@suse.de>
>>>            Nick Clifton  <nickc@redhat.com>
>>>
>>>        PR testsuite/78421
>>>        * lib/target-supports.exp (check_effective_target_vect_hw_misalign):
>>>        If the target is ARM return the result of the
>>>        check_effective_target_arm_vect_no_misalign proc.
>>>        * gcc.dg/vect/vect-strided-a-u8-i2-gap.c: If the target does not
>>>        support unaligned vectors then only expect one of the loops to be
>>>        unrolled.
>>>
>>>
>> --
>> Richard Biener <rguenther@suse.de>
>> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-11  8:56             ` Richard Biener
@ 2016-11-18 13:12               ` Christophe Lyon
  0 siblings, 0 replies; 15+ messages in thread
From: Christophe Lyon @ 2016-11-18 13:12 UTC (permalink / raw)
  To: Richard Biener; +Cc: Bin.Cheng, gcc-patches

On 11 November 2016 at 09:56, Richard Biener <rguenther@suse.de> wrote:
> On Thu, 10 Nov 2016, Christophe Lyon wrote:
>
>> On 10 November 2016 at 09:34, Richard Biener <rguenther@suse.de> wrote:
>> > On Wed, 9 Nov 2016, Christophe Lyon wrote:
>> >
>> >> On 9 November 2016 at 09:36, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> >> > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguenther@suse.de> wrote:
>> >> >> On Mon, 7 Nov 2016, Christophe Lyon wrote:
>> >> >>
>> >> >>> Hi Richard,
>> >> >>>
>> >> >>>
>> >> >>> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
>> >> >>> >
>> >> >>> > The following fixes an oversight when computing alignment in the
>> >> >>> > vectorizer.
>> >> >>> >
>> >> >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>> >> >>> >
>> >> >>> > Richard.
>> >> >>> >
>> >> >>> > 2016-11-07  Richard Biener  <rguenther@suse.de>
>> >> >>> >
>> >> >>> >         PR tree-optimization/78189
>> >> >>> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
>> >> >>> >         alignment computation.
>> >> >>> >
>> >> >>> >         * g++.dg/torture/pr78189.C: New testcase.
>> >> >>> >
>> >> >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
>> >> >>> > ===================================================================
>> >> >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
>> >> >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
>> >> >>> > @@ -0,0 +1,41 @@
>> >> >>> > +/* { dg-do run } */
>> >> >>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
>> >> >>> > +
>> >> >>> > +#include <cstddef>
>> >> >>> > +
>> >> >>> > +struct A
>> >> >>> > +{
>> >> >>> > +  void * a;
>> >> >>> > +  void * b;
>> >> >>> > +};
>> >> >>> > +
>> >> >>> > +struct alignas(16) B
>> >> >>> > +{
>> >> >>> > +  void * pad;
>> >> >>> > +  void * misaligned;
>> >> >>> > +  void * pad2;
>> >> >>> > +
>> >> >>> > +  A a;
>> >> >>> > +
>> >> >>> > +  void Null();
>> >> >>> > +};
>> >> >>> > +
>> >> >>> > +void B::Null()
>> >> >>> > +{
>> >> >>> > +  a.a = nullptr;
>> >> >>> > +  a.b = nullptr;
>> >> >>> > +}
>> >> >>> > +
>> >> >>> > +void __attribute__((noinline,noclone))
>> >> >>> > +NullB(void * misalignedPtr)
>> >> >>> > +{
>> >> >>> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
>> >> >>> > +  b->Null();
>> >> >>> > +}
>> >> >>> > +
>> >> >>> > +int main()
>> >> >>> > +{
>> >> >>> > +  B b;
>> >> >>> > +  NullB(&b.misaligned);
>> >> >>> > +  return 0;
>> >> >>> > +}
>> >> >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> >> >>> > index 9346cfe..b03cb1e 100644
>> >> >>> > --- gcc/tree-vect-data-refs.c
>> >> >>> > +++ gcc/tree-vect-data-refs.c
>> >> >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
>> >> >>> >    base = ref;
>> >> >>> >    while (handled_component_p (base))
>> >> >>> >      base = TREE_OPERAND (base, 0);
>> >> >>> > +  unsigned int base_alignment;
>> >> >>> > +  unsigned HOST_WIDE_INT base_bitpos;
>> >> >>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
>> >> >>> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
>> >> >>> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
>> >> >>> > +     adjust things to make base_alignment valid as the alignment of
>> >> >>> > +     DR_BASE_ADDRESS.  */
>> >> >>> >    if (TREE_CODE (base) == MEM_REF)
>> >> >>> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
>> >> >>> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
>> >> >>> > -  unsigned int base_alignment = get_object_alignment (base);
>> >> >>> > +    {
>> >> >>> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
>> >> >>> > +      base_bitpos &= (base_alignment - 1);
>> >> >>> > +    }
>> >> >>> > +  if (base_bitpos != 0)
>> >> >>> > +    base_alignment = base_bitpos & -base_bitpos;
>> >> >>> > +  /* Also look at the alignment of the base address DR analysis
>> >> >>> > +     computed.  */
>> >> >>> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
>> >> >>> > +  if (base_addr_alignment > base_alignment)
>> >> >>> > +    base_alignment = base_addr_alignment;
>> >> >>> >
>> >> >>> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>> >> >>> >      DR_VECT_AUX (dr)->base_element_aligned = true;
>> >> >>>
>> >> >>> Since you committed this patch (r241892), I'm seeing execution failures:
>> >> >>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
>> >> >>>   gcc.dg/vect/pr40074.c execution test
>> >> >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
>> >> >>> --with-fpu=neon-fp16
>> >> >>> (using qemu as simulator)
>> >> >>
>> >> >> The difference is that we now vectorize the testcase with versioning
>> >> >> for alignment (but it should never execute the vectorized variant).
>> >> >> I need arm peoples help to understand what is wrong.
>> >> > Hi All,
>> >> > I will look at it.
>> >> >
>> >>
>> >> Hi,
>> >>
>> >> This is causing new regressions on armeb:
>> >>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
>> >> scan-tree-dump-times vect "vectorized 2 loops" 1
>> >>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
>> >> "vectorized 2 loops" 1
>> >
>> > It's actually an improvement as armeb can't do unaligned loads.
>> > Before the patch we versioned both loops for alignment (they
>> > can't be possibly both aligned by luck).  After the patch we
>> > align 'arr' and thus the first loop becomes known-aligned and
>> > vectorized while the 2nd one is now known unaligned (and thus
>> > not vectorized because we can't do unaligned accesses).
>> >
>> > I suppose the following will fix it.  Can you test that on
>> > some arm variants please?
>> >
>> > It works ok on x86_64/i?86 so please commit if it fixes the issue
>> > for you.
>> >
>>
>> Well, it works on armeb (the test now passes), but regresses on
>> arm-*   :(
>
> You have to help me here, it's not efficient time spent by me
> figuring out the right dg-targets here.  I can only suppose
> that maybe it uses load-lanes and that's happy with unaligned
> accesses but hw_misaling is still set to false.  Otherwise
> I can't see what's endian specific here (other than that
> -eb is probably still second class citizen in vectorization
> feature enablement?)
>
> Please open a bugreport against the testsuite in the meantime.
>

I understand, and I have just opened pr78421 to keep track of this.
Sorry for the delay.

Christophe

> Richard.
>
>> Christophe
>>
>> > Thanks,
>> > Richard.
>> >
>> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
>> > index 52fdcf6..3752600 100644
>> > --- a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
>> > +++ b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
>> > @@ -71,5 +71,5 @@ int main (void)
>> >    return 0;
>> >  }
>> >
>> > -/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target vect_strided2 } } } */
>> > -
>> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target { vect_strided2 && { ! vect_hw_misalign } } } } } */
>> > +/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target { vect_strided2 && vect_hw_misalign } } } } */
>>
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-10 14:17           ` Christophe Lyon
@ 2016-11-11  8:56             ` Richard Biener
  2016-11-18 13:12               ` Christophe Lyon
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2016-11-11  8:56 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Bin.Cheng, gcc-patches

On Thu, 10 Nov 2016, Christophe Lyon wrote:

> On 10 November 2016 at 09:34, Richard Biener <rguenther@suse.de> wrote:
> > On Wed, 9 Nov 2016, Christophe Lyon wrote:
> >
> >> On 9 November 2016 at 09:36, Bin.Cheng <amker.cheng@gmail.com> wrote:
> >> > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguenther@suse.de> wrote:
> >> >> On Mon, 7 Nov 2016, Christophe Lyon wrote:
> >> >>
> >> >>> Hi Richard,
> >> >>>
> >> >>>
> >> >>> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
> >> >>> >
> >> >>> > The following fixes an oversight when computing alignment in the
> >> >>> > vectorizer.
> >> >>> >
> >> >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >> >>> >
> >> >>> > Richard.
> >> >>> >
> >> >>> > 2016-11-07  Richard Biener  <rguenther@suse.de>
> >> >>> >
> >> >>> >         PR tree-optimization/78189
> >> >>> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> >> >>> >         alignment computation.
> >> >>> >
> >> >>> >         * g++.dg/torture/pr78189.C: New testcase.
> >> >>> >
> >> >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
> >> >>> > ===================================================================
> >> >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> >> >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> >> >>> > @@ -0,0 +1,41 @@
> >> >>> > +/* { dg-do run } */
> >> >>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> >> >>> > +
> >> >>> > +#include <cstddef>
> >> >>> > +
> >> >>> > +struct A
> >> >>> > +{
> >> >>> > +  void * a;
> >> >>> > +  void * b;
> >> >>> > +};
> >> >>> > +
> >> >>> > +struct alignas(16) B
> >> >>> > +{
> >> >>> > +  void * pad;
> >> >>> > +  void * misaligned;
> >> >>> > +  void * pad2;
> >> >>> > +
> >> >>> > +  A a;
> >> >>> > +
> >> >>> > +  void Null();
> >> >>> > +};
> >> >>> > +
> >> >>> > +void B::Null()
> >> >>> > +{
> >> >>> > +  a.a = nullptr;
> >> >>> > +  a.b = nullptr;
> >> >>> > +}
> >> >>> > +
> >> >>> > +void __attribute__((noinline,noclone))
> >> >>> > +NullB(void * misalignedPtr)
> >> >>> > +{
> >> >>> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
> >> >>> > +  b->Null();
> >> >>> > +}
> >> >>> > +
> >> >>> > +int main()
> >> >>> > +{
> >> >>> > +  B b;
> >> >>> > +  NullB(&b.misaligned);
> >> >>> > +  return 0;
> >> >>> > +}
> >> >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> >> >>> > index 9346cfe..b03cb1e 100644
> >> >>> > --- gcc/tree-vect-data-refs.c
> >> >>> > +++ gcc/tree-vect-data-refs.c
> >> >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> >> >>> >    base = ref;
> >> >>> >    while (handled_component_p (base))
> >> >>> >      base = TREE_OPERAND (base, 0);
> >> >>> > +  unsigned int base_alignment;
> >> >>> > +  unsigned HOST_WIDE_INT base_bitpos;
> >> >>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> >> >>> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
> >> >>> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> >> >>> > +     adjust things to make base_alignment valid as the alignment of
> >> >>> > +     DR_BASE_ADDRESS.  */
> >> >>> >    if (TREE_CODE (base) == MEM_REF)
> >> >>> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> >> >>> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> >> >>> > -  unsigned int base_alignment = get_object_alignment (base);
> >> >>> > +    {
> >> >>> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> >> >>> > +      base_bitpos &= (base_alignment - 1);
> >> >>> > +    }
> >> >>> > +  if (base_bitpos != 0)
> >> >>> > +    base_alignment = base_bitpos & -base_bitpos;
> >> >>> > +  /* Also look at the alignment of the base address DR analysis
> >> >>> > +     computed.  */
> >> >>> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> >> >>> > +  if (base_addr_alignment > base_alignment)
> >> >>> > +    base_alignment = base_addr_alignment;
> >> >>> >
> >> >>> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
> >> >>> >      DR_VECT_AUX (dr)->base_element_aligned = true;
> >> >>>
> >> >>> Since you committed this patch (r241892), I'm seeing execution failures:
> >> >>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
> >> >>>   gcc.dg/vect/pr40074.c execution test
> >> >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
> >> >>> --with-fpu=neon-fp16
> >> >>> (using qemu as simulator)
> >> >>
> >> >> The difference is that we now vectorize the testcase with versioning
> >> >> for alignment (but it should never execute the vectorized variant).
> >> >> I need arm peoples help to understand what is wrong.
> >> > Hi All,
> >> > I will look at it.
> >> >
> >>
> >> Hi,
> >>
> >> This is causing new regressions on armeb:
> >>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
> >> scan-tree-dump-times vect "vectorized 2 loops" 1
> >>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
> >> "vectorized 2 loops" 1
> >
> > It's actually an improvement as armeb can't do unaligned loads.
> > Before the patch we versioned both loops for alignment (they
> > can't be possibly both aligned by luck).  After the patch we
> > align 'arr' and thus the first loop becomes known-aligned and
> > vectorized while the 2nd one is now known unaligned (and thus
> > not vectorized because we can't do unaligned accesses).
> >
> > I suppose the following will fix it.  Can you test that on
> > some arm variants please?
> >
> > It works ok on x86_64/i?86 so please commit if it fixes the issue
> > for you.
> >
> 
> Well, it works on armeb (the test now passes), but regresses on
> arm-*   :(

You have to help me here, it's not efficient time spent by me
figuring out the right dg-targets here.  I can only suppose
that maybe it uses load-lanes and that's happy with unaligned
accesses but hw_misaling is still set to false.  Otherwise
I can't see what's endian specific here (other than that
-eb is probably still second class citizen in vectorization
feature enablement?)

Please open a bugreport against the testsuite in the meantime.

Richard.

> Christophe
> 
> > Thanks,
> > Richard.
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
> > index 52fdcf6..3752600 100644
> > --- a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
> > @@ -71,5 +71,5 @@ int main (void)
> >    return 0;
> >  }
> >
> > -/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target vect_strided2 } } } */
> > -
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target { vect_strided2 && { ! vect_hw_misalign } } } } } */
> > +/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target { vect_strided2 && vect_hw_misalign } } } } */
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-10  8:34         ` Richard Biener
@ 2016-11-10 14:17           ` Christophe Lyon
  2016-11-11  8:56             ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Christophe Lyon @ 2016-11-10 14:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: Bin.Cheng, gcc-patches

On 10 November 2016 at 09:34, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 9 Nov 2016, Christophe Lyon wrote:
>
>> On 9 November 2016 at 09:36, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguenther@suse.de> wrote:
>> >> On Mon, 7 Nov 2016, Christophe Lyon wrote:
>> >>
>> >>> Hi Richard,
>> >>>
>> >>>
>> >>> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
>> >>> >
>> >>> > The following fixes an oversight when computing alignment in the
>> >>> > vectorizer.
>> >>> >
>> >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>> >>> >
>> >>> > Richard.
>> >>> >
>> >>> > 2016-11-07  Richard Biener  <rguenther@suse.de>
>> >>> >
>> >>> >         PR tree-optimization/78189
>> >>> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
>> >>> >         alignment computation.
>> >>> >
>> >>> >         * g++.dg/torture/pr78189.C: New testcase.
>> >>> >
>> >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
>> >>> > ===================================================================
>> >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
>> >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
>> >>> > @@ -0,0 +1,41 @@
>> >>> > +/* { dg-do run } */
>> >>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
>> >>> > +
>> >>> > +#include <cstddef>
>> >>> > +
>> >>> > +struct A
>> >>> > +{
>> >>> > +  void * a;
>> >>> > +  void * b;
>> >>> > +};
>> >>> > +
>> >>> > +struct alignas(16) B
>> >>> > +{
>> >>> > +  void * pad;
>> >>> > +  void * misaligned;
>> >>> > +  void * pad2;
>> >>> > +
>> >>> > +  A a;
>> >>> > +
>> >>> > +  void Null();
>> >>> > +};
>> >>> > +
>> >>> > +void B::Null()
>> >>> > +{
>> >>> > +  a.a = nullptr;
>> >>> > +  a.b = nullptr;
>> >>> > +}
>> >>> > +
>> >>> > +void __attribute__((noinline,noclone))
>> >>> > +NullB(void * misalignedPtr)
>> >>> > +{
>> >>> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
>> >>> > +  b->Null();
>> >>> > +}
>> >>> > +
>> >>> > +int main()
>> >>> > +{
>> >>> > +  B b;
>> >>> > +  NullB(&b.misaligned);
>> >>> > +  return 0;
>> >>> > +}
>> >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> >>> > index 9346cfe..b03cb1e 100644
>> >>> > --- gcc/tree-vect-data-refs.c
>> >>> > +++ gcc/tree-vect-data-refs.c
>> >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
>> >>> >    base = ref;
>> >>> >    while (handled_component_p (base))
>> >>> >      base = TREE_OPERAND (base, 0);
>> >>> > +  unsigned int base_alignment;
>> >>> > +  unsigned HOST_WIDE_INT base_bitpos;
>> >>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
>> >>> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
>> >>> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
>> >>> > +     adjust things to make base_alignment valid as the alignment of
>> >>> > +     DR_BASE_ADDRESS.  */
>> >>> >    if (TREE_CODE (base) == MEM_REF)
>> >>> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
>> >>> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
>> >>> > -  unsigned int base_alignment = get_object_alignment (base);
>> >>> > +    {
>> >>> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
>> >>> > +      base_bitpos &= (base_alignment - 1);
>> >>> > +    }
>> >>> > +  if (base_bitpos != 0)
>> >>> > +    base_alignment = base_bitpos & -base_bitpos;
>> >>> > +  /* Also look at the alignment of the base address DR analysis
>> >>> > +     computed.  */
>> >>> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
>> >>> > +  if (base_addr_alignment > base_alignment)
>> >>> > +    base_alignment = base_addr_alignment;
>> >>> >
>> >>> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>> >>> >      DR_VECT_AUX (dr)->base_element_aligned = true;
>> >>>
>> >>> Since you committed this patch (r241892), I'm seeing execution failures:
>> >>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
>> >>>   gcc.dg/vect/pr40074.c execution test
>> >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
>> >>> --with-fpu=neon-fp16
>> >>> (using qemu as simulator)
>> >>
>> >> The difference is that we now vectorize the testcase with versioning
>> >> for alignment (but it should never execute the vectorized variant).
>> >> I need arm peoples help to understand what is wrong.
>> > Hi All,
>> > I will look at it.
>> >
>>
>> Hi,
>>
>> This is causing new regressions on armeb:
>>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
>> scan-tree-dump-times vect "vectorized 2 loops" 1
>>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
>> "vectorized 2 loops" 1
>
> It's actually an improvement as armeb can't do unaligned loads.
> Before the patch we versioned both loops for alignment (they
> can't be possibly both aligned by luck).  After the patch we
> align 'arr' and thus the first loop becomes known-aligned and
> vectorized while the 2nd one is now known unaligned (and thus
> not vectorized because we can't do unaligned accesses).
>
> I suppose the following will fix it.  Can you test that on
> some arm variants please?
>
> It works ok on x86_64/i?86 so please commit if it fixes the issue
> for you.
>

Well, it works on armeb (the test now passes), but regresses on
arm-*   :(

Christophe

> Thanks,
> Richard.
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
> index 52fdcf6..3752600 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
> @@ -71,5 +71,5 @@ int main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target vect_strided2 } } } */
> -
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target { vect_strided2 && { ! vect_hw_misalign } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target { vect_strided2 && vect_hw_misalign } } } } */

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-09 18:16       ` Christophe Lyon
@ 2016-11-10  8:34         ` Richard Biener
  2016-11-10 14:17           ` Christophe Lyon
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2016-11-10  8:34 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: Bin.Cheng, gcc-patches

On Wed, 9 Nov 2016, Christophe Lyon wrote:

> On 9 November 2016 at 09:36, Bin.Cheng <amker.cheng@gmail.com> wrote:
> > On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguenther@suse.de> wrote:
> >> On Mon, 7 Nov 2016, Christophe Lyon wrote:
> >>
> >>> Hi Richard,
> >>>
> >>>
> >>> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
> >>> >
> >>> > The following fixes an oversight when computing alignment in the
> >>> > vectorizer.
> >>> >
> >>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >>> >
> >>> > Richard.
> >>> >
> >>> > 2016-11-07  Richard Biener  <rguenther@suse.de>
> >>> >
> >>> >         PR tree-optimization/78189
> >>> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> >>> >         alignment computation.
> >>> >
> >>> >         * g++.dg/torture/pr78189.C: New testcase.
> >>> >
> >>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
> >>> > ===================================================================
> >>> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> >>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> >>> > @@ -0,0 +1,41 @@
> >>> > +/* { dg-do run } */
> >>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> >>> > +
> >>> > +#include <cstddef>
> >>> > +
> >>> > +struct A
> >>> > +{
> >>> > +  void * a;
> >>> > +  void * b;
> >>> > +};
> >>> > +
> >>> > +struct alignas(16) B
> >>> > +{
> >>> > +  void * pad;
> >>> > +  void * misaligned;
> >>> > +  void * pad2;
> >>> > +
> >>> > +  A a;
> >>> > +
> >>> > +  void Null();
> >>> > +};
> >>> > +
> >>> > +void B::Null()
> >>> > +{
> >>> > +  a.a = nullptr;
> >>> > +  a.b = nullptr;
> >>> > +}
> >>> > +
> >>> > +void __attribute__((noinline,noclone))
> >>> > +NullB(void * misalignedPtr)
> >>> > +{
> >>> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
> >>> > +  b->Null();
> >>> > +}
> >>> > +
> >>> > +int main()
> >>> > +{
> >>> > +  B b;
> >>> > +  NullB(&b.misaligned);
> >>> > +  return 0;
> >>> > +}
> >>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> >>> > index 9346cfe..b03cb1e 100644
> >>> > --- gcc/tree-vect-data-refs.c
> >>> > +++ gcc/tree-vect-data-refs.c
> >>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> >>> >    base = ref;
> >>> >    while (handled_component_p (base))
> >>> >      base = TREE_OPERAND (base, 0);
> >>> > +  unsigned int base_alignment;
> >>> > +  unsigned HOST_WIDE_INT base_bitpos;
> >>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> >>> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
> >>> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> >>> > +     adjust things to make base_alignment valid as the alignment of
> >>> > +     DR_BASE_ADDRESS.  */
> >>> >    if (TREE_CODE (base) == MEM_REF)
> >>> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> >>> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> >>> > -  unsigned int base_alignment = get_object_alignment (base);
> >>> > +    {
> >>> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> >>> > +      base_bitpos &= (base_alignment - 1);
> >>> > +    }
> >>> > +  if (base_bitpos != 0)
> >>> > +    base_alignment = base_bitpos & -base_bitpos;
> >>> > +  /* Also look at the alignment of the base address DR analysis
> >>> > +     computed.  */
> >>> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> >>> > +  if (base_addr_alignment > base_alignment)
> >>> > +    base_alignment = base_addr_alignment;
> >>> >
> >>> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
> >>> >      DR_VECT_AUX (dr)->base_element_aligned = true;
> >>>
> >>> Since you committed this patch (r241892), I'm seeing execution failures:
> >>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
> >>>   gcc.dg/vect/pr40074.c execution test
> >>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
> >>> --with-fpu=neon-fp16
> >>> (using qemu as simulator)
> >>
> >> The difference is that we now vectorize the testcase with versioning
> >> for alignment (but it should never execute the vectorized variant).
> >> I need arm peoples help to understand what is wrong.
> > Hi All,
> > I will look at it.
> >
> 
> Hi,
> 
> This is causing new regressions on armeb:
>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vectorized 2 loops" 1
>   gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
> "vectorized 2 loops" 1

It's actually an improvement as armeb can't do unaligned loads.
Before the patch we versioned both loops for alignment (they
can't be possibly both aligned by luck).  After the patch we
align 'arr' and thus the first loop becomes known-aligned and
vectorized while the 2nd one is now known unaligned (and thus
not vectorized because we can't do unaligned accesses).

I suppose the following will fix it.  Can you test that on
some arm variants please?

It works ok on x86_64/i?86 so please commit if it fixes the issue
for you.

Thanks,
Richard.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
index 52fdcf6..3752600 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-strided-a-u8-i2-gap.c
@@ -71,5 +71,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target vect_strided2 } } } */
-  
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target { vect_strided2 && { ! vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect"  { target { vect_strided2 && vect_hw_misalign } } } } */

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-09  8:37     ` Bin.Cheng
@ 2016-11-09 18:16       ` Christophe Lyon
  2016-11-10  8:34         ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Christophe Lyon @ 2016-11-09 18:16 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Richard Biener, gcc-patches

On 9 November 2016 at 09:36, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguenther@suse.de> wrote:
>> On Mon, 7 Nov 2016, Christophe Lyon wrote:
>>
>>> Hi Richard,
>>>
>>>
>>> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
>>> >
>>> > The following fixes an oversight when computing alignment in the
>>> > vectorizer.
>>> >
>>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>>> >
>>> > Richard.
>>> >
>>> > 2016-11-07  Richard Biener  <rguenther@suse.de>
>>> >
>>> >         PR tree-optimization/78189
>>> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
>>> >         alignment computation.
>>> >
>>> >         * g++.dg/torture/pr78189.C: New testcase.
>>> >
>>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
>>> > ===================================================================
>>> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
>>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
>>> > @@ -0,0 +1,41 @@
>>> > +/* { dg-do run } */
>>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
>>> > +
>>> > +#include <cstddef>
>>> > +
>>> > +struct A
>>> > +{
>>> > +  void * a;
>>> > +  void * b;
>>> > +};
>>> > +
>>> > +struct alignas(16) B
>>> > +{
>>> > +  void * pad;
>>> > +  void * misaligned;
>>> > +  void * pad2;
>>> > +
>>> > +  A a;
>>> > +
>>> > +  void Null();
>>> > +};
>>> > +
>>> > +void B::Null()
>>> > +{
>>> > +  a.a = nullptr;
>>> > +  a.b = nullptr;
>>> > +}
>>> > +
>>> > +void __attribute__((noinline,noclone))
>>> > +NullB(void * misalignedPtr)
>>> > +{
>>> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
>>> > +  b->Null();
>>> > +}
>>> > +
>>> > +int main()
>>> > +{
>>> > +  B b;
>>> > +  NullB(&b.misaligned);
>>> > +  return 0;
>>> > +}
>>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>>> > index 9346cfe..b03cb1e 100644
>>> > --- gcc/tree-vect-data-refs.c
>>> > +++ gcc/tree-vect-data-refs.c
>>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
>>> >    base = ref;
>>> >    while (handled_component_p (base))
>>> >      base = TREE_OPERAND (base, 0);
>>> > +  unsigned int base_alignment;
>>> > +  unsigned HOST_WIDE_INT base_bitpos;
>>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
>>> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
>>> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
>>> > +     adjust things to make base_alignment valid as the alignment of
>>> > +     DR_BASE_ADDRESS.  */
>>> >    if (TREE_CODE (base) == MEM_REF)
>>> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
>>> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
>>> > -  unsigned int base_alignment = get_object_alignment (base);
>>> > +    {
>>> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
>>> > +      base_bitpos &= (base_alignment - 1);
>>> > +    }
>>> > +  if (base_bitpos != 0)
>>> > +    base_alignment = base_bitpos & -base_bitpos;
>>> > +  /* Also look at the alignment of the base address DR analysis
>>> > +     computed.  */
>>> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
>>> > +  if (base_addr_alignment > base_alignment)
>>> > +    base_alignment = base_addr_alignment;
>>> >
>>> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>>> >      DR_VECT_AUX (dr)->base_element_aligned = true;
>>>
>>> Since you committed this patch (r241892), I'm seeing execution failures:
>>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
>>>   gcc.dg/vect/pr40074.c execution test
>>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
>>> --with-fpu=neon-fp16
>>> (using qemu as simulator)
>>
>> The difference is that we now vectorize the testcase with versioning
>> for alignment (but it should never execute the vectorized variant).
>> I need arm peoples help to understand what is wrong.
> Hi All,
> I will look at it.
>

Hi,

This is causing new regressions on armeb:
  gcc.dg/vect/vect-strided-a-u8-i2-gap.c -flto -ffat-lto-objects
scan-tree-dump-times vect "vectorized 2 loops" 1
  gcc.dg/vect/vect-strided-a-u8-i2-gap.c scan-tree-dump-times vect
"vectorized 2 loops" 1

pr470074 passes now,

Thanks,

Christophe

> Thanks,
> bin
>>
>> At least the testcase shows there is (kind-of) a missed optimization
>> that we no longer figure out versioning for alignment is useless.
>> I'll look into that.
>>
>> Richard.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-08  9:12   ` Richard Biener
  2016-11-08  9:23     ` Richard Biener
@ 2016-11-09  8:37     ` Bin.Cheng
  2016-11-09 18:16       ` Christophe Lyon
  1 sibling, 1 reply; 15+ messages in thread
From: Bin.Cheng @ 2016-11-09  8:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: Christophe Lyon, gcc-patches

On Tue, Nov 8, 2016 at 9:11 AM, Richard Biener <rguenther@suse.de> wrote:
> On Mon, 7 Nov 2016, Christophe Lyon wrote:
>
>> Hi Richard,
>>
>>
>> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
>> >
>> > The following fixes an oversight when computing alignment in the
>> > vectorizer.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>> >
>> > Richard.
>> >
>> > 2016-11-07  Richard Biener  <rguenther@suse.de>
>> >
>> >         PR tree-optimization/78189
>> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
>> >         alignment computation.
>> >
>> >         * g++.dg/torture/pr78189.C: New testcase.
>> >
>> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
>> > ===================================================================
>> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
>> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
>> > @@ -0,0 +1,41 @@
>> > +/* { dg-do run } */
>> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
>> > +
>> > +#include <cstddef>
>> > +
>> > +struct A
>> > +{
>> > +  void * a;
>> > +  void * b;
>> > +};
>> > +
>> > +struct alignas(16) B
>> > +{
>> > +  void * pad;
>> > +  void * misaligned;
>> > +  void * pad2;
>> > +
>> > +  A a;
>> > +
>> > +  void Null();
>> > +};
>> > +
>> > +void B::Null()
>> > +{
>> > +  a.a = nullptr;
>> > +  a.b = nullptr;
>> > +}
>> > +
>> > +void __attribute__((noinline,noclone))
>> > +NullB(void * misalignedPtr)
>> > +{
>> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
>> > +  b->Null();
>> > +}
>> > +
>> > +int main()
>> > +{
>> > +  B b;
>> > +  NullB(&b.misaligned);
>> > +  return 0;
>> > +}
>> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> > index 9346cfe..b03cb1e 100644
>> > --- gcc/tree-vect-data-refs.c
>> > +++ gcc/tree-vect-data-refs.c
>> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
>> >    base = ref;
>> >    while (handled_component_p (base))
>> >      base = TREE_OPERAND (base, 0);
>> > +  unsigned int base_alignment;
>> > +  unsigned HOST_WIDE_INT base_bitpos;
>> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
>> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
>> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
>> > +     adjust things to make base_alignment valid as the alignment of
>> > +     DR_BASE_ADDRESS.  */
>> >    if (TREE_CODE (base) == MEM_REF)
>> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
>> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
>> > -  unsigned int base_alignment = get_object_alignment (base);
>> > +    {
>> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
>> > +      base_bitpos &= (base_alignment - 1);
>> > +    }
>> > +  if (base_bitpos != 0)
>> > +    base_alignment = base_bitpos & -base_bitpos;
>> > +  /* Also look at the alignment of the base address DR analysis
>> > +     computed.  */
>> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
>> > +  if (base_addr_alignment > base_alignment)
>> > +    base_alignment = base_addr_alignment;
>> >
>> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>> >      DR_VECT_AUX (dr)->base_element_aligned = true;
>>
>> Since you committed this patch (r241892), I'm seeing execution failures:
>>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
>>   gcc.dg/vect/pr40074.c execution test
>> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
>> --with-fpu=neon-fp16
>> (using qemu as simulator)
>
> The difference is that we now vectorize the testcase with versioning
> for alignment (but it should never execute the vectorized variant).
> I need arm peoples help to understand what is wrong.
Hi All,
I will look at it.

Thanks,
bin
>
> At least the testcase shows there is (kind-of) a missed optimization
> that we no longer figure out versioning for alignment is useless.
> I'll look into that.
>
> Richard.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-08  9:23     ` Richard Biener
@ 2016-11-09  8:08       ` Richard Biener
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Biener @ 2016-11-09  8:08 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: gcc-patches

On Tue, 8 Nov 2016, Richard Biener wrote:

> On Tue, 8 Nov 2016, Richard Biener wrote:
> 
> > On Mon, 7 Nov 2016, Christophe Lyon wrote:
> > 
> > > Hi Richard,
> > > 
> > > 
> > > On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
> > > >
> > > > The following fixes an oversight when computing alignment in the
> > > > vectorizer.
> > > >
> > > > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > > >
> > > > Richard.
> > > >
> > > > 2016-11-07  Richard Biener  <rguenther@suse.de>
> > > >
> > > >         PR tree-optimization/78189
> > > >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> > > >         alignment computation.
> > > >
> > > >         * g++.dg/torture/pr78189.C: New testcase.
> > > >
> > > > Index: gcc/testsuite/g++.dg/torture/pr78189.C
> > > > ===================================================================
> > > > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> > > > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> > > > @@ -0,0 +1,41 @@
> > > > +/* { dg-do run } */
> > > > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> > > > +
> > > > +#include <cstddef>
> > > > +
> > > > +struct A
> > > > +{
> > > > +  void * a;
> > > > +  void * b;
> > > > +};
> > > > +
> > > > +struct alignas(16) B
> > > > +{
> > > > +  void * pad;
> > > > +  void * misaligned;
> > > > +  void * pad2;
> > > > +
> > > > +  A a;
> > > > +
> > > > +  void Null();
> > > > +};
> > > > +
> > > > +void B::Null()
> > > > +{
> > > > +  a.a = nullptr;
> > > > +  a.b = nullptr;
> > > > +}
> > > > +
> > > > +void __attribute__((noinline,noclone))
> > > > +NullB(void * misalignedPtr)
> > > > +{
> > > > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
> > > > +  b->Null();
> > > > +}
> > > > +
> > > > +int main()
> > > > +{
> > > > +  B b;
> > > > +  NullB(&b.misaligned);
> > > > +  return 0;
> > > > +}
> > > > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> > > > index 9346cfe..b03cb1e 100644
> > > > --- gcc/tree-vect-data-refs.c
> > > > +++ gcc/tree-vect-data-refs.c
> > > > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> > > >    base = ref;
> > > >    while (handled_component_p (base))
> > > >      base = TREE_OPERAND (base, 0);
> > > > +  unsigned int base_alignment;
> > > > +  unsigned HOST_WIDE_INT base_bitpos;
> > > > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> > > > +  /* As data-ref analysis strips the MEM_REF down to its base operand
> > > > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> > > > +     adjust things to make base_alignment valid as the alignment of
> > > > +     DR_BASE_ADDRESS.  */
> > > >    if (TREE_CODE (base) == MEM_REF)
> > > > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> > > > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> > > > -  unsigned int base_alignment = get_object_alignment (base);
> > > > +    {
> > > > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> > > > +      base_bitpos &= (base_alignment - 1);
> > > > +    }
> > > > +  if (base_bitpos != 0)
> > > > +    base_alignment = base_bitpos & -base_bitpos;
> > > > +  /* Also look at the alignment of the base address DR analysis
> > > > +     computed.  */
> > > > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> > > > +  if (base_addr_alignment > base_alignment)
> > > > +    base_alignment = base_addr_alignment;
> > > >
> > > >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
> > > >      DR_VECT_AUX (dr)->base_element_aligned = true;
> > > 
> > > Since you committed this patch (r241892), I'm seeing execution failures:
> > >   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
> > >   gcc.dg/vect/pr40074.c execution test
> > > on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
> > > --with-fpu=neon-fp16
> > > (using qemu as simulator)
> > 
> > The difference is that we now vectorize the testcase with versioning
> > for alignment (but it should never execute the vectorized variant).
> > I need arm peoples help to understand what is wrong.
> > 
> > At least the testcase shows there is (kind-of) a missed optimization
> > that we no longer figure out versioning for alignment is useless.
> > I'll look into that.
> 
> I'm testing the following which restores previous behavior.  You
> might still want to figure out what went wrong.

This is what I have applied.

Bootstrapped / regtested on x86_64-unknown-linux-gnu.

Richard.

2016-11-09  Richard Biener  <rguenther@suse.de>

	* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
	Look at the DR_BASE_ADDRESS object for forcing alignment.

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	(revision 241990)
+++ gcc/tree-vect-data-refs.c	(working copy)
@@ -813,12 +813,9 @@ vect_compute_data_ref_alignment (struct
 
   if (base_alignment < TYPE_ALIGN (vectype))
     {
-      /* Strip an inner MEM_REF to a bare decl if possible.  */
-      if (TREE_CODE (base) == MEM_REF
-	  && integer_zerop (TREE_OPERAND (base, 1))
-	  && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
-	base = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
-
+      base = base_addr;
+      if (TREE_CODE (base) == ADDR_EXPR)
+	base = TREE_OPERAND (base, 0);
       if (!vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype)))
 	{
 	  if (dump_enabled_p ())

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-08  9:12   ` Richard Biener
@ 2016-11-08  9:23     ` Richard Biener
  2016-11-09  8:08       ` Richard Biener
  2016-11-09  8:37     ` Bin.Cheng
  1 sibling, 1 reply; 15+ messages in thread
From: Richard Biener @ 2016-11-08  9:23 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: gcc-patches

On Tue, 8 Nov 2016, Richard Biener wrote:

> On Mon, 7 Nov 2016, Christophe Lyon wrote:
> 
> > Hi Richard,
> > 
> > 
> > On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
> > >
> > > The following fixes an oversight when computing alignment in the
> > > vectorizer.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> > >
> > > Richard.
> > >
> > > 2016-11-07  Richard Biener  <rguenther@suse.de>
> > >
> > >         PR tree-optimization/78189
> > >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> > >         alignment computation.
> > >
> > >         * g++.dg/torture/pr78189.C: New testcase.
> > >
> > > Index: gcc/testsuite/g++.dg/torture/pr78189.C
> > > ===================================================================
> > > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> > > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> > > @@ -0,0 +1,41 @@
> > > +/* { dg-do run } */
> > > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> > > +
> > > +#include <cstddef>
> > > +
> > > +struct A
> > > +{
> > > +  void * a;
> > > +  void * b;
> > > +};
> > > +
> > > +struct alignas(16) B
> > > +{
> > > +  void * pad;
> > > +  void * misaligned;
> > > +  void * pad2;
> > > +
> > > +  A a;
> > > +
> > > +  void Null();
> > > +};
> > > +
> > > +void B::Null()
> > > +{
> > > +  a.a = nullptr;
> > > +  a.b = nullptr;
> > > +}
> > > +
> > > +void __attribute__((noinline,noclone))
> > > +NullB(void * misalignedPtr)
> > > +{
> > > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
> > > +  b->Null();
> > > +}
> > > +
> > > +int main()
> > > +{
> > > +  B b;
> > > +  NullB(&b.misaligned);
> > > +  return 0;
> > > +}
> > > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> > > index 9346cfe..b03cb1e 100644
> > > --- gcc/tree-vect-data-refs.c
> > > +++ gcc/tree-vect-data-refs.c
> > > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> > >    base = ref;
> > >    while (handled_component_p (base))
> > >      base = TREE_OPERAND (base, 0);
> > > +  unsigned int base_alignment;
> > > +  unsigned HOST_WIDE_INT base_bitpos;
> > > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> > > +  /* As data-ref analysis strips the MEM_REF down to its base operand
> > > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> > > +     adjust things to make base_alignment valid as the alignment of
> > > +     DR_BASE_ADDRESS.  */
> > >    if (TREE_CODE (base) == MEM_REF)
> > > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> > > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> > > -  unsigned int base_alignment = get_object_alignment (base);
> > > +    {
> > > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> > > +      base_bitpos &= (base_alignment - 1);
> > > +    }
> > > +  if (base_bitpos != 0)
> > > +    base_alignment = base_bitpos & -base_bitpos;
> > > +  /* Also look at the alignment of the base address DR analysis
> > > +     computed.  */
> > > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> > > +  if (base_addr_alignment > base_alignment)
> > > +    base_alignment = base_addr_alignment;
> > >
> > >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
> > >      DR_VECT_AUX (dr)->base_element_aligned = true;
> > 
> > Since you committed this patch (r241892), I'm seeing execution failures:
> >   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
> >   gcc.dg/vect/pr40074.c execution test
> > on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
> > --with-fpu=neon-fp16
> > (using qemu as simulator)
> 
> The difference is that we now vectorize the testcase with versioning
> for alignment (but it should never execute the vectorized variant).
> I need arm peoples help to understand what is wrong.
> 
> At least the testcase shows there is (kind-of) a missed optimization
> that we no longer figure out versioning for alignment is useless.
> I'll look into that.

I'm testing the following which restores previous behavior.  You
might still want to figure out what went wrong.

Richard.

2016-11-08  Richard Biener  <rguenther@suse.de>

	* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
	Look at the DR_BASE_ADDRESS object for forcing alignment.

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index b03cb1e..48995f4 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -813,12 +813,7 @@ vect_compute_data_ref_alignment (struct 
data_reference *dr)
 
   if (base_alignment < TYPE_ALIGN (vectype))
     {
-      /* Strip an inner MEM_REF to a bare decl if possible.  */
-      if (TREE_CODE (base) == MEM_REF
-         && integer_zerop (TREE_OPERAND (base, 1))
-         && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
-       base = TREE_OPERAND (TREE_OPERAND (base, 0), 0);
-
+      base = TREE_OPERAND (base_addr, 0);
       if (!vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype)))
        {
          if (dump_enabled_p ())

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-07 20:40 ` Christophe Lyon
@ 2016-11-08  9:12   ` Richard Biener
  2016-11-08  9:23     ` Richard Biener
  2016-11-09  8:37     ` Bin.Cheng
  0 siblings, 2 replies; 15+ messages in thread
From: Richard Biener @ 2016-11-08  9:12 UTC (permalink / raw)
  To: Christophe Lyon; +Cc: gcc-patches

On Mon, 7 Nov 2016, Christophe Lyon wrote:

> Hi Richard,
> 
> 
> On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
> >
> > The following fixes an oversight when computing alignment in the
> > vectorizer.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >
> > Richard.
> >
> > 2016-11-07  Richard Biener  <rguenther@suse.de>
> >
> >         PR tree-optimization/78189
> >         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
> >         alignment computation.
> >
> >         * g++.dg/torture/pr78189.C: New testcase.
> >
> > Index: gcc/testsuite/g++.dg/torture/pr78189.C
> > ===================================================================
> > --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> > +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> > @@ -0,0 +1,41 @@
> > +/* { dg-do run } */
> > +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> > +
> > +#include <cstddef>
> > +
> > +struct A
> > +{
> > +  void * a;
> > +  void * b;
> > +};
> > +
> > +struct alignas(16) B
> > +{
> > +  void * pad;
> > +  void * misaligned;
> > +  void * pad2;
> > +
> > +  A a;
> > +
> > +  void Null();
> > +};
> > +
> > +void B::Null()
> > +{
> > +  a.a = nullptr;
> > +  a.b = nullptr;
> > +}
> > +
> > +void __attribute__((noinline,noclone))
> > +NullB(void * misalignedPtr)
> > +{
> > +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
> > +  b->Null();
> > +}
> > +
> > +int main()
> > +{
> > +  B b;
> > +  NullB(&b.misaligned);
> > +  return 0;
> > +}
> > diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> > index 9346cfe..b03cb1e 100644
> > --- gcc/tree-vect-data-refs.c
> > +++ gcc/tree-vect-data-refs.c
> > @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
> >    base = ref;
> >    while (handled_component_p (base))
> >      base = TREE_OPERAND (base, 0);
> > +  unsigned int base_alignment;
> > +  unsigned HOST_WIDE_INT base_bitpos;
> > +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> > +  /* As data-ref analysis strips the MEM_REF down to its base operand
> > +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> > +     adjust things to make base_alignment valid as the alignment of
> > +     DR_BASE_ADDRESS.  */
> >    if (TREE_CODE (base) == MEM_REF)
> > -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> > -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> > -  unsigned int base_alignment = get_object_alignment (base);
> > +    {
> > +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> > +      base_bitpos &= (base_alignment - 1);
> > +    }
> > +  if (base_bitpos != 0)
> > +    base_alignment = base_bitpos & -base_bitpos;
> > +  /* Also look at the alignment of the base address DR analysis
> > +     computed.  */
> > +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> > +  if (base_addr_alignment > base_alignment)
> > +    base_alignment = base_addr_alignment;
> >
> >    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
> >      DR_VECT_AUX (dr)->base_element_aligned = true;
> 
> Since you committed this patch (r241892), I'm seeing execution failures:
>   gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
>   gcc.dg/vect/pr40074.c execution test
> on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
> --with-fpu=neon-fp16
> (using qemu as simulator)

The difference is that we now vectorize the testcase with versioning
for alignment (but it should never execute the vectorized variant).
I need arm peoples help to understand what is wrong.

At least the testcase shows there is (kind-of) a missed optimization
that we no longer figure out versioning for alignment is useless.
I'll look into that.

Richard.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] Fix PR78189
  2016-11-07  8:02 Richard Biener
@ 2016-11-07 20:40 ` Christophe Lyon
  2016-11-08  9:12   ` Richard Biener
  0 siblings, 1 reply; 15+ messages in thread
From: Christophe Lyon @ 2016-11-07 20:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi Richard,


On 7 November 2016 at 09:01, Richard Biener <rguenther@suse.de> wrote:
>
> The following fixes an oversight when computing alignment in the
> vectorizer.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>
> Richard.
>
> 2016-11-07  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/78189
>         * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
>         alignment computation.
>
>         * g++.dg/torture/pr78189.C: New testcase.
>
> Index: gcc/testsuite/g++.dg/torture/pr78189.C
> ===================================================================
> --- gcc/testsuite/g++.dg/torture/pr78189.C      (revision 0)
> +++ gcc/testsuite/g++.dg/torture/pr78189.C      (working copy)
> @@ -0,0 +1,41 @@
> +/* { dg-do run } */
> +/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
> +
> +#include <cstddef>
> +
> +struct A
> +{
> +  void * a;
> +  void * b;
> +};
> +
> +struct alignas(16) B
> +{
> +  void * pad;
> +  void * misaligned;
> +  void * pad2;
> +
> +  A a;
> +
> +  void Null();
> +};
> +
> +void B::Null()
> +{
> +  a.a = nullptr;
> +  a.b = nullptr;
> +}
> +
> +void __attribute__((noinline,noclone))
> +NullB(void * misalignedPtr)
> +{
> +  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
> +  b->Null();
> +}
> +
> +int main()
> +{
> +  B b;
> +  NullB(&b.misaligned);
> +  return 0;
> +}
> diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 9346cfe..b03cb1e 100644
> --- gcc/tree-vect-data-refs.c
> +++ gcc/tree-vect-data-refs.c
> @@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
>    base = ref;
>    while (handled_component_p (base))
>      base = TREE_OPERAND (base, 0);
> +  unsigned int base_alignment;
> +  unsigned HOST_WIDE_INT base_bitpos;
> +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> +  /* As data-ref analysis strips the MEM_REF down to its base operand
> +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> +     adjust things to make base_alignment valid as the alignment of
> +     DR_BASE_ADDRESS.  */
>    if (TREE_CODE (base) == MEM_REF)
> -    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
> -                  build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
> -  unsigned int base_alignment = get_object_alignment (base);
> +    {
> +      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> +      base_bitpos &= (base_alignment - 1);
> +    }
> +  if (base_bitpos != 0)
> +    base_alignment = base_bitpos & -base_bitpos;
> +  /* Also look at the alignment of the base address DR analysis
> +     computed.  */
> +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> +  if (base_addr_alignment > base_alignment)
> +    base_alignment = base_addr_alignment;
>
>    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>      DR_VECT_AUX (dr)->base_element_aligned = true;

Since you committed this patch (r241892), I'm seeing execution failures:
  gcc.dg/vect/pr40074.c -flto -ffat-lto-objects execution test
  gcc.dg/vect/pr40074.c execution test
on armeb-none-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9
--with-fpu=neon-fp16
(using qemu as simulator)

Christophe

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH] Fix PR78189
@ 2016-11-07  8:02 Richard Biener
  2016-11-07 20:40 ` Christophe Lyon
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Biener @ 2016-11-07  8:02 UTC (permalink / raw)
  To: gcc-patches


The following fixes an oversight when computing alignment in the
vectorizer.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-11-07  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/78189
	* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Fix
	alignment computation.

	* g++.dg/torture/pr78189.C: New testcase.

Index: gcc/testsuite/g++.dg/torture/pr78189.C
===================================================================
--- gcc/testsuite/g++.dg/torture/pr78189.C	(revision 0)
+++ gcc/testsuite/g++.dg/torture/pr78189.C	(working copy)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-slp-vectorize -fno-vect-cost-model" } */
+
+#include <cstddef>
+
+struct A
+{
+  void * a;
+  void * b;
+};
+
+struct alignas(16) B
+{
+  void * pad;
+  void * misaligned;
+  void * pad2;
+
+  A a;
+
+  void Null();
+};
+
+void B::Null()
+{
+  a.a = nullptr;
+  a.b = nullptr;
+}
+
+void __attribute__((noinline,noclone))
+NullB(void * misalignedPtr)
+{
+  B* b = reinterpret_cast<B*>(reinterpret_cast<char *>(misalignedPtr) - offsetof(B, misaligned));
+  b->Null();
+}
+
+int main()
+{
+  B b;
+  NullB(&b.misaligned);
+  return 0;
+}
diff --git gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 9346cfe..b03cb1e 100644
--- gcc/tree-vect-data-refs.c
+++ gcc/tree-vect-data-refs.c
@@ -773,10 +773,25 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
   base = ref;
   while (handled_component_p (base))
     base = TREE_OPERAND (base, 0);
+  unsigned int base_alignment;
+  unsigned HOST_WIDE_INT base_bitpos;
+  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
+  /* As data-ref analysis strips the MEM_REF down to its base operand
+     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
+     adjust things to make base_alignment valid as the alignment of
+     DR_BASE_ADDRESS.  */
   if (TREE_CODE (base) == MEM_REF)
-    base = build2 (MEM_REF, TREE_TYPE (base), base_addr,
-		   build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), 0));
-  unsigned int base_alignment = get_object_alignment (base);
+    {
+      base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
+      base_bitpos &= (base_alignment - 1);
+    }
+  if (base_bitpos != 0)
+    base_alignment = base_bitpos & -base_bitpos;
+  /* Also look at the alignment of the base address DR analysis
+     computed.  */
+  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
+  if (base_addr_alignment > base_alignment)
+    base_alignment = base_addr_alignment;
 
   if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
     DR_VECT_AUX (dr)->base_element_aligned = true;

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-01-25 10:22 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-20 16:35 [PATCH] Fix PR78189 Nick Clifton
2017-01-23  9:05 ` Richard Biener
2017-01-23 19:45   ` Christophe Lyon
2017-01-25 10:28     ` Kyrill Tkachov
  -- strict thread matches above, loose matches on Subject: below --
2016-11-07  8:02 Richard Biener
2016-11-07 20:40 ` Christophe Lyon
2016-11-08  9:12   ` Richard Biener
2016-11-08  9:23     ` Richard Biener
2016-11-09  8:08       ` Richard Biener
2016-11-09  8:37     ` Bin.Cheng
2016-11-09 18:16       ` Christophe Lyon
2016-11-10  8:34         ` Richard Biener
2016-11-10 14:17           ` Christophe Lyon
2016-11-11  8:56             ` Richard Biener
2016-11-18 13:12               ` Christophe Lyon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).