[gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

public inbox for gcc-patches@gcc.gnu.org

* [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
@ 2013-10-31 15:44 Yuri Rumyantsev
  2013-10-31 16:19 ` Jakub Jelinek
  0 siblings, 1 reply; 44+ messages in thread
From: Yuri Rumyantsev @ 2013-10-31 15:44 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Igor Zamyatin, Areg Melik-Adamyan

[-- Attachment #1: Type: text/plain, Size: 415 bytes --]

Hi All,

Here is a simple fix that allows a loop marked with
'pragma omp simd' to be vectorized even if the cost model tells us that
vectorization is not profitable.
I checked that it works as expected on a simple test case.

Is it Ok for trunk?

ChangeLog:

2013-10-31  Yuri Rumyantsev  <ysrumyan@gmail.com>

* tree-vect-loop.c (vect_estimate_min_profitable_iters): Override
cost estimation for loops marked as vectorizable.

[-- Attachment #2: patch --]
[-- Type: application/octet-stream, Size: 1541 bytes --]

Index: tree-vect-loop.c
===================================================================
--- tree-vect-loop.c	(revision 204142)
+++ tree-vect-loop.c	(working copy)
@@ -2929,16 +2929,31 @@
   /* vector version will never be profitable.  */
   else
     {
-      if (dump_enabled_p ())
-        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-			 "cost model: the vector iteration cost = %d "
-			 "divided by the scalar iteration cost = %d "
-			 "is greater or equal to the vectorization factor = %d"
-                         ".\n",
-			 vec_inside_cost, scalar_single_iter_cost, vf);
-      *ret_min_profitable_niters = -1;
-      *ret_min_profitable_estimate = -1;
-      return;
+      if (loop_vinfo->loop->force_vect)
+	{
+	  min_profitable_iters = vf;
+	  /* Pragma omp simd was specified, skip cost estimation. */
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "Override non-profitable vectorization "
+			     "estimation since loop was marked for "
+			     "vectorization.\n");
+	}
+      else
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "cost model: the vector iteration cost = %d "
+			     "divided by the scalar iteration cost = %d "
+			     "is greater or equal to the "
+			     "vectorization factor = %d"
+			     ".\n",
+			     vec_inside_cost, scalar_single_iter_cost, vf);
+      
+	  *ret_min_profitable_niters = -1;
+	  *ret_min_profitable_estimate = -1;
+	  return;
+	}
     }
 
   if (dump_enabled_p ())


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-10-31 15:44 [gomp4 simd, RFC] Simple fix to override vectorization cost estimation Yuri Rumyantsev
@ 2013-10-31 16:19 ` Jakub Jelinek
  2013-10-31 19:10   ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-10-31 16:19 UTC (permalink / raw)
  To: Yuri Rumyantsev, Richard Biener
  Cc: gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Thu, Oct 31, 2013 at 07:02:28PM +0400, Yuri Rumyantsev wrote:
> Here is a simple fix which allows to vectorize loop marked with
> 'pragma omp simd' even if cost model tells us that vectorization is
> not profitable.
> I checked that on simple test-case is works as expected.
> 
> Is it Ok for trunk?
> 
> ChangeLog:
> 
> 2013-10-31  Yuri Rumyantsev  <ysrumyan@gmail.com>
> 
> * tree-vect-loop.c (vect_estimate_min_profitable_iters): Override
> cost estimation for loops marked as vectorizable.

That looks too simplistic; IMHO it is undesirable to disregard the
profitability checks altogether.  For #pragma omp simd or #pragma simd
loops, I can understand that we should admit our cost model is not of very
high quality and so in border cases consider vectorizing rather than not
vectorizing, say for force_vect by increasing the scalar cost by some
factor or decreasing the vector cost by some factor, but disregarding it
altogether doesn't look wise.  The question is what factor we should use:
150% of scalar cost, something else?
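
To make the idea of biasing the check (rather than bypassing it) concrete,
here is a minimal sketch in plain C.  It is not GCC code: the function
never_profitable and its scalar_bias_percent parameter are hypothetical,
and only the comparison itself (vector iteration cost against scalar
iteration cost times the vectorization factor) follows the dump message
in the posted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the "never profitable" test.  The vector loop body
   cannot win when its cost per vector iteration is at least VF times the
   scalar cost per iteration.  SCALAR_BIAS_PERCENT is an assumed knob for
   force_vect loops: 100 means no bias, 150 inflates the scalar cost by
   50 percent.  */
bool
never_profitable (int vec_inside_cost, int scalar_single_iter_cost, int vf,
                  int scalar_bias_percent)
{
  assert (vf > 0 && scalar_bias_percent > 0);

  /* Scale both sides by 100 so the comparison stays in integers.  */
  long biased_scalar = (long) scalar_single_iter_cost * scalar_bias_percent;
  return (long) vec_inside_cost * 100 >= biased_scalar * vf;
}
```

With made-up costs of 9 for the vector body, 2 for the scalar body and a
vectorization factor of 4, the unbiased test rejects the loop
(900 >= 800), while a 150% bias lets it through (900 < 1200).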

	Jakub


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-10-31 16:19 ` Jakub Jelinek
@ 2013-10-31 19:10   ` Richard Biener
  2013-11-12 13:18     ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-10-31 19:10 UTC (permalink / raw)
  To: Jakub Jelinek, Jakub Jelinek, Yuri Rumyantsev
  Cc: gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

Jakub Jelinek <jakub@redhat.com> wrote:
>On Thu, Oct 31, 2013 at 07:02:28PM +0400, Yuri Rumyantsev wrote:
>> Here is a simple fix which allows to vectorize loop marked with
>> 'pragma omp simd' even if cost model tells us that vectorization is
>> not profitable.
>> I checked that on simple test-case is works as expected.
>> 
>> Is it Ok for trunk?
>> 
>> ChangeLog:
>> 
>> 2013-10-31  Yuri Rumyantsev  <ysrumyan@gmail.com>
>> 
>> * tree-vect-loop.c (vect_estimate_min_profitable_iters): Override
>> cost estimation for loops marked as vectorizable.
>
>That looks too simplistics, IMHO it is undesirable to disregard the
>profitability checks together.  For #pragma omp simd or #pragma simd
>loops, I can understand that we should admit our cost model is not very
>high
>quality and so in border cases consider vectorizing rather than not
>vectorizing, say for force_vect by increasing the scalar cost by some
>factor or decreasing vector cost by some factor, but disregarding it
>altogether doesn't look wise.  The question is what factor should we
>use?
>150% of scalar cost, something else?

Please improve the cost-model instead.

Thanks,
Richard.

>	Jakub



* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-10-31 19:10   ` Richard Biener
@ 2013-11-12 13:18     ` Sergey Ostanevich
  2013-11-12 13:46       ` Jakub Jelinek
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-12 13:18 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Yuri Rumyantsev, gcc-patches, Igor Zamyatin,
	Areg Melik-Adamyan

Richard, Jakub,

You are right regarding the cost model if we talk about the vectorizer alone.
But #pragma omp simd goes beyond the vectorizer: it introduces a
parallel context at a place the user defines, similar to #pragma omp parallel.
Are we applying any cost model to an omp parallel region?

You can consider this pragma a helper that lets the developer 'easily'
introduce parallelism into his code. Hence any type of cost model,
whatever its quality, will work against this paradigm, forcing the user
to work around our cost model to get the loop simd-parallel.

Sergos


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 13:18     ` Sergey Ostanevich
@ 2013-11-12 13:46       ` Jakub Jelinek
  2013-11-12 14:16         ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-12 13:46 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Richard Biener, Yuri Rumyantsev, gcc-patches, Igor Zamyatin,
	Areg Melik-Adamyan

On Tue, Nov 12, 2013 at 02:48:46PM +0400, Sergey Ostanevich wrote:
> You are right regarding the cost model if we talk about vectorizer alone.
> But the #pragma omp simd goes beyond the vectorizer - it introduces
> parallel context in a place user defines, similar to #pragma omp parallel.
> Are we applying any cost model for omp parallel region?
> 
> You can consider this pragma as a helper for developer for 'easily'
> introduce parallelism in his code, hence any type of cost model -
> whatever quality it is - will plays against this paradigm, forcing user
> to play around our cost model to let it make the loop simd-parallel.

So what do other compilers do here?  Does icc also totally ignore all
cost analysis for #pragma omp simd or #pragma simd and vectorize even when
it would be obviously undesirable?

	Jakub


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 13:46       ` Jakub Jelinek
@ 2013-11-12 14:16         ` Sergey Ostanevich
  2013-11-12 14:28           ` Jakub Jelinek
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-12 14:16 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Yuri Rumyantsev, gcc-patches, Igor Zamyatin,
	Areg Melik-Adamyan

Yes, ICC ignores cost analysis and follows the user's request to introduce
simd parallelism in the loop. It follows the omp parallel semantics.


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 14:16         ` Sergey Ostanevich
@ 2013-11-12 14:28           ` Jakub Jelinek
  2013-11-12 14:49             ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-12 14:28 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Richard Biener, Yuri Rumyantsev, gcc-patches, Igor Zamyatin,
	Areg Melik-Adamyan

On Tue, Nov 12, 2013 at 04:45:17PM +0400, Sergey Ostanevich wrote:
> yes, ICC ignores cost analysis and follows user request on introduction of
> simd parallelism in the loop.they follow the omp parallel semantics.

What about #pragma ivdep?  I.e. if we decided to follow ICC here, should
we ignore cost model just for loop->safelen && loop->force_vect loops
(or only loop->force_vect), or also for any other loop->safelen loops?

	Jakub


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 14:28           ` Jakub Jelinek
@ 2013-11-12 14:49             ` Sergey Ostanevich
  2013-11-12 15:16               ` Jakub Jelinek
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-12 14:49 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Yuri Rumyantsev, gcc-patches, Igor Zamyatin,
	Areg Melik-Adamyan

ivdep just substitutes for all cross-iteration data dependence analysis;
it has nothing to do with the cost model. ICC does not disable its
cost model for #pragma ivdep.

As for safelen, the OpenMP standard treats it as a limit
on the vector length. This means that if no safelen is present,
an arbitrary vector length can be used.
So I believe loop->force_vect is the only trigger to disregard
the cost model.
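
As an illustration of what the safelen clause means at the source level,
here is a small sketch in C.  The function name shift_add and the loop
itself are made up for this example and do not come from any test case in
this thread.

```c
#include <assert.h>
#include <stddef.h>

/* a[i] depends on a[i - 4], so at most 4 consecutive iterations may be
   executed together in one SIMD chunk; safelen(4) states exactly that
   limit.  Without -fopenmp or -fopenmp-simd the pragma is ignored and the
   loop stays scalar, producing the same result.  */
void
shift_add (int *a, const int *b, size_t n)
{
  assert (a != NULL && b != NULL);

#pragma omp simd safelen(4)
  for (size_t i = 4; i < n; i++)
    a[i] = a[i - 4] + b[i];
}
```

If the safelen clause were dropped from this loop, the pragma would claim
that any vector length is safe, which is not true for this dependence
distance.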


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 14:49             ` Sergey Ostanevich
@ 2013-11-12 15:16               ` Jakub Jelinek
  2013-11-12 15:39                 ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-12 15:16 UTC (permalink / raw)
  To: Sergey Ostanevich, Richard Henderson
  Cc: Richard Biener, Yuri Rumyantsev, gcc-patches, Igor Zamyatin,
	Areg Melik-Adamyan

On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich wrote:
> ivdep just substitutes all cross-iteration data analysis,
> nothing related to cost model. ICC does not cancel its
> cost model in case of #pragma ivdep
> 
> as for the safelen - OMP standart treats it as a limitation
> for the vector length. this means if no safelen is present
> an arbitrary vector length can be used.

I was talking about GCC loop->safelen, which is INT_MAX for #pragma omp simd
without safelen clause or #pragma simd without vectorlength clause.

> so I believe loop->force_vect is the only trigger to disregard
> the cost model

Anyway, in that case I think the originally posted patch is wrong.
If we want to treat force_vect as disregarding the whole cost model and
forcing vectorization (well, the name of the field already kind of suggests
that), then IMHO we should treat it the same as -fvect-cost-model=unlimited
for those loops.

Thus (untested):

2013-11-12  Jakub Jelinek  <jakub@redhat.com>

	* tree-vect-loop.c (vect_estimate_min_profitable_iters): Use
	unlimited cost model also for force_vect loops.

--- gcc/tree-vect-loop.c.jj	2013-11-12 12:09:40.000000000 +0100
+++ gcc/tree-vect-loop.c	2013-11-12 15:11:43.821404330 +0100
@@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
 
   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model () || LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;

	Jakub


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 15:16               ` Jakub Jelinek
@ 2013-11-12 15:39                 ` Richard Biener
  2013-11-12 16:15                   ` Jakub Jelinek
  2013-11-12 18:59                   ` Sergey Ostanevich
  0 siblings, 2 replies; 44+ messages in thread
From: Richard Biener @ 2013-11-12 15:39 UTC (permalink / raw)
  To: Jakub Jelinek, Sergey Ostanevich, Richard Henderson
  Cc: Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On 11/12/13 3:16 PM, Jakub Jelinek wrote:
> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich wrote:
>> ivdep just substitutes all cross-iteration data analysis,
>> nothing related to cost model. ICC does not cancel its
>> cost model in case of #pragma ivdep
>>
>> as for the safelen - OMP standart treats it as a limitation
>> for the vector length. this means if no safelen is present
>> an arbitrary vector length can be used.
> 
> I was talking about GCC loop->safelen, which is INT_MAX for #pragma omp simd
> without safelen clause or #pragma simd without vectorlength clause.
> 
>> so I believe loop->force_vect is the only trigger to disregard
>> the cost model
> 
> Anyway, in that case I think the originally posted patch is wrong,
> if we want to treat force_vect as disregard all the cost model and
> force vectorization (well, the name of the field already kind of suggest
> that), then IMHO we should treat it the same as -fvect-cost-model=unlimited
> for those loops.

Err - the user may have a specific sub-architecture in mind when using
#pragma simd.  If you say we should completely ignore the cost model,
then should we also sorry () if we cannot vectorize the loop (either
because of GCC deficiencies or lack of sub-target support)?

That said, at least in the cases where the cost model says the loop
is never profitable to vectorize we should follow its advice.

Richard.


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 15:39                 ` Richard Biener
@ 2013-11-12 16:15                   ` Jakub Jelinek
  2013-11-12 18:59                   ` Sergey Ostanevich
  1 sibling, 0 replies; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-12 16:15 UTC (permalink / raw)
  To: Richard Biener
  Cc: Sergey Ostanevich, Richard Henderson, Yuri Rumyantsev,
	gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Tue, Nov 12, 2013 at 03:35:13PM +0100, Richard Biener wrote:
> > Anyway, in that case I think the originally posted patch is wrong,
> > if we want to treat force_vect as disregard all the cost model and
> > force vectorization (well, the name of the field already kind of suggest
> > that), then IMHO we should treat it the same as -fvect-cost-model=unlimited
> > for those loops.
> 
> Err - the user may have a specific sub-architecture in mind when using
> #pragma simd, if you say we should completely ignore the cost model
> then should we also sorry () if we cannot vectorize the loop (either
> because of GCC deficiencies or lack of sub-target support)?

A sorry is too strong.  AFAIK Cilk+ originally had some clause for
#pragma simd to request a compiler error if the loop hasn't been
vectorized, but it was removed afterwards; in OpenMP you can certainly
write valid code which will never be vectorizable.  Warning about it,
with a way to turn the warning off, is surely possible, but sorry/error
is IMHO wrong.  In the spec the construct is just an optimization hint;
now the question is how strong a hint we should consider it.

> That said, at least in the cases that the cost model says the loop
> is never profitable to vectorize we should follow its advice.

But that is exactly the spot that Yuri was originally changing (to my
surprise, not all the other cases).

Dunno if we want to let the user choose what they want, e.g.
-fsimd-vect-cost-model={unlimited,dynamic,cheap}, which would override
-fvect-cost-model for simd loops, or something different?
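
A rough sketch of how such an override could be resolved per loop, written
as standalone C rather than against the GCC sources.  The enum, its
constants and the function effective_cost_model are all hypothetical; only
the three model names and the "override for simd loops" behaviour come
from the proposal above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum mirroring the three model names in the proposed
   option.  VECT_MODEL_UNSET means no simd-specific override was given.  */
enum vect_model
{
  VECT_MODEL_UNSET,
  VECT_MODEL_UNLIMITED,
  VECT_MODEL_DYNAMIC,
  VECT_MODEL_CHEAP
};

/* Pick the cost model for one loop: the simd-specific setting wins, but
   only for loops marked force_vect and only if it was actually set.  */
enum vect_model
effective_cost_model (enum vect_model global_model,
                      enum vect_model simd_model, bool force_vect)
{
  assert (global_model != VECT_MODEL_UNSET);

  if (force_vect && simd_model != VECT_MODEL_UNSET)
    return simd_model;
  return global_model;
}
```

With this shape, ordinary loops keep following -fvect-cost-model, and the
question of how strong a hint the pragma is becomes a matter of choosing
the default for the simd-specific setting.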

	Jakub


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 15:39                 ` Richard Biener
  2013-11-12 16:15                   ` Jakub Jelinek
@ 2013-11-12 18:59                   ` Sergey Ostanevich
  2013-11-13  9:59                     ` Richard Biener
  1 sibling, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-12 18:59 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

The reason the patch was in its original state is that we want
to notify the user that his assumption of profitability may be wrong.
This is not part of any spec, and as far as I know ICC does not
notify the user about this case. Still, it can be a good hint for those
users who try to get as much performance as possible.

Richard's comment on the vectorization problems is about the same thing:
informing the user that his attempt to force vectorization has failed.

As for profitable or not, sometimes I believe it's impossible to be
precise. For OMP we have the case of a vector version of a function,
and we have no chance to figure out whether it is profitable to use
it or to lose it. If we can't map the loop to any vector length
other than 1, I believe we have to bail out and report.
Is that what 'never profitable' is about?



* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-12 18:59                   ` Sergey Ostanevich
@ 2013-11-13  9:59                     ` Richard Biener
  2013-11-13 18:04                       ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-13  9:59 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Tue, 12 Nov 2013, Sergey Ostanevich wrote:

> The reason patch was in its original state is because we want
> to notify user that his assumption of profitability may be wrong.
> This is not a part of any spec and as far as I know ICC does not
> notify user about the case. Still it can be a good hint for those
> users who tries to get as much as possible performance.
> 
> Richard's comment on the vectorization problems is about the same -
> to inform user that his attempt to force vectorization is failed.
> 
> As for profitable or not - sometimes I believe it's impossible to be
> precise. For OMP we have case of a vector version of a function
> and we have no chance to figure out whether it is profitable to use
> it or to loose it. If we can't map the loop for any vector length
> other than 1 - I believe in this case we have to bail out and report.
> Is it about 'never profitable'?

For example.  I think we should report non-vectorized loops
that are marked with force_vect anyway, with -Wdisabled-optimization.
Another case is that a loop may be profitable to vectorize if
the ISA supports a gather instruction but otherwise not.  Or if the
ISA supports efficient vector construction from N non-loop-invariant
scalars (for vectorization of strided loads).
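
For readers less familiar with these two access patterns, here is a
minimal sketch of the kind of loops meant.  The function names gather_copy
and strided_copy are invented for illustration; whether either loop is
worth vectorizing depends on the target ISA, which is exactly what the
cost model is supposed to decide.

```c
#include <assert.h>
#include <stddef.h>

/* Indexed load: each lane reads src[idx[i]].  Vectorizing this well needs
   a hardware gather; without one the vector must be built from separate
   scalar loads.  */
void
gather_copy (float *dst, const float *src, const int *idx, size_t n)
{
  assert (dst != NULL && src != NULL && idx != NULL);

#pragma omp simd
  for (size_t i = 0; i < n; i++)
    dst[i] = src[idx[i]];
}

/* Strided load: each lane reads src[i * stride].  The vector has to be
   assembled from N scalars that change on every iteration.  */
void
strided_copy (float *dst, const float *src, size_t stride, size_t n)
{
  assert (dst != NULL && src != NULL);

#pragma omp simd
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i * stride];
}
```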

Simply disregarding all of the cost analysis sounds completely
bogus to me.

I'd simply go for the diagnostic for now, not changing anything else.
We want to have a good understanding of why the cost model is
so bad that we have to force ignoring it for #pragma simd - thus we
want testcases.

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-13  9:59                     ` Richard Biener
@ 2013-11-13 18:04                       ` Sergey Ostanevich
  2013-11-14 10:16                         ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-13 18:04 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

I will get some tests.
As for cost analysis - simply consider the pragma as a request to
vectorize. How can I - as a developer - enforce it beyond the pragma?

On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
>
>> The reason patch was in its original state is because we want
>> to notify user that his assumption of profitability may be wrong.
>> This is not a part of any spec and as far as I know ICC does not
>> notify user about the case. Still it can be a good hint for those
>> users who tries to get as much as possible performance.
>>
>> Richard's comment on the vectorization problems is about the same -
>> to inform user that his attempt to force vectorization is failed.
>>
>> As for profitable or not - sometimes I believe it's impossible to be
>> precise. For OMP we have case of a vector version of a function
>> and we have no chance to figure out whether it is profitable to use
>> it or to loose it. If we can't map the loop for any vector length
>> other than 1 - I believe in this case we have to bail out and report.
>> Is it about 'never profitable'?
>
> For example.  I think we should report non-vectorized loops
> that are marked with force_vect anyway, with -Wdisabled-optimization.
> Another case is that a loop may be profitable to vectorize if
> the ISA supports a gather instruction but otherwise not.  Or if the
> ISA supports efficient vector construction from N not loop
> invariant scalars (for vectorization of strided loads).
>
> Simply disregarding all of the cost analysis sounds completely
> bogus to me.
>
> I'd simply go for the diagnostic for now, not changing anything else.
> We want to have a good understanding about why the cost model is
> so bad that we have to force to ignore it for #pragma simd - thus we
> want testcases.
>
> Richard.
>
>>
>> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener <rguenther@suse.de> wrote:
>> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
>> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich wrote:
>> >>> ivdep just substitutes all cross-iteration data analysis,
>> >>> nothing related to cost model. ICC does not cancel its
>> >>> cost model in case of #pragma ivdep
>> >>>
>> >>> as for the safelen - OMP standart treats it as a limitation
>> >>> for the vector length. this means if no safelen is present
>> >>> an arbitrary vector length can be used.
>> >>
>> >> I was talking about GCC loop->safelen, which is INT_MAX for #pragma omp simd
>> >> without safelen clause or #pragma simd without vectorlength clause.
>> >>
>> >>> so I believe loop->force_vect is the only trigger to disregard
>> >>> the cost model
>> >>
>> >> Anyway, in that case I think the originally posted patch is wrong,
>> >> if we want to treat force_vect as "disregard the whole cost model and
>> >> force vectorization" (well, the name of the field already kind of suggests
>> >> that), then IMHO we should treat it the same as -fvect-cost-model=unlimited
>> >> for those loops.
>> >
>> > Err - the user may have a specific sub-architecture in mind when using
>> > #pragma simd, if you say we should completely ignore the cost model
>> > then should we also sorry () if we cannot vectorize the loop (either
>> > because of GCC deficiencies or lack of sub-target support)?
>> >
>> > That said, at least in the cases that the cost model says the loop
>> > is never profitable to vectorize we should follow its advice.
>> >
>> > Richard.
>> >
>> >> Thus (untested):
>> >>
>> >> 2013-11-12  Jakub Jelinek  <jakub@redhat.com>
>> >>
>> >>       * tree-vect-loop.c (vect_estimate_min_profitable_iters): Use
>> >>       unlimited cost model also for force_vect loops.
>> >>
>> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.000000000 +0100
>> >> +++ gcc/tree-vect-loop.c      2013-11-12 15:11:43.821404330 +0100
>> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
>> >>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
>> >>
>> >>    /* Cost model disabled.  */
>> >> -  if (unlimited_cost_model ())
>> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
>> >>      {
>> >>        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
>> >>        *ret_min_profitable_niters = 0;
>> >>
>> >>       Jakub
>> >>
>> >
>>
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imend

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-13 18:04                       ` Sergey Ostanevich
@ 2013-11-14 10:16                         ` Richard Biener
  2013-11-14 20:51                           ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-14 10:16 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Wed, 13 Nov 2013, Sergey Ostanevich wrote:

> I will get some tests.
> As for cost analysis - simply consider the pragma as a request to
> vectorize. How can I - as a developer - enforce it beyond the pragma?

You can disable the cost model via -fvect-cost-model=unlimited

Richard.

> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguenther@suse.de> wrote:
> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
> >
> >> The reason the patch was in its original state is that we want
> >> to notify the user that his assumption of profitability may be wrong.
> >> This is not a part of any spec and as far as I know ICC does not
> >> notify the user about the case. Still it can be a good hint for those
> >> users who try to get as much performance as possible.

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-14 10:16                         ` Richard Biener
@ 2013-11-14 20:51                           ` Sergey Ostanevich
  2013-11-14 22:31                             ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-14 20:51 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

Is this only for the whole file? I mean having a particular loop
vectorized in a file while all the others are left to the compiler's
cost model. Is there such a mechanism?

Sergos

On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
>
>> I will get some tests.
>> As for cost analysis - simply consider the pragma as a request to
>> vectorize. How can I - as a developer - enforce it beyond the pragma?
>
> You can disable the cost model via -fvect-cost-model=unlimited
>
> Richard.
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-14 20:51                           ` Sergey Ostanevich
@ 2013-11-14 22:31                             ` Richard Biener
  2013-11-15 14:25                               ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-14 22:31 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
>Is this only for the whole file? I mean having a particular loop
>vectorized in a file while all the others are left to the compiler's
>cost model. Is there such a mechanism?

No, there is not.

Richard.

>Sergos


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-14 22:31                             ` Richard Biener
@ 2013-11-15 14:25                               ` Sergey Ostanevich
  2013-11-15 15:11                                 ` Jakub Jelinek
  2013-11-15 15:24                                 ` Richard Biener
  0 siblings, 2 replies; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-15 14:25 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

[-- Attachment #1: Type: text/plain, Size: 6542 bytes --]

Richard,

here's an example that triggers the cost model. As soon as elemental
functions appear and we update the vectorizer so it can accept an
elemental function inside the loop, we will have the same situation as
we have now: the cost model will bail out on its profitability estimate.
Still, we have no chance to get info on how efficient the bar() function
is when it is in vector form.

I believe I should repeat: #pragma omp simd is intended to introduce an
instruction-level parallel region at the developer's request, and hence
should be treated in the same manner as #pragma omp parallel. The
vectorizer cost model is an obstacle here, not a help.

Regards,
Sergos


On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener <rguenther@suse.de> wrote:
> Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
>>Is this only for the whole file? I mean having a particular loop
>>vectorized in a file while all the others are left to the compiler's
>>cost model. Is there such a mechanism?
>
> No, there is not.
>
> Richard.
>
>>Sergos

[-- Attachment #2: t4.cpp --]
[-- Type: text/x-c++src, Size: 846 bytes --]

typedef float K[5];


struct Str1
{
  unsigned short u1, u2, u3; 
  int i1;             
  float f1, f2;      
  float f3;            
  K k1; 
};

struct Str2
{
  unsigned short u1, u2, u3; 
  int i1;             
  float f1, f2;      
  float f3;            
  float f4;
  float f5;
};


struct Str3
{
  float f1;
  unsigned char u1;
  union
  {
   K k1;
   struct Str1 *str1;
   struct Str2 *str2;
  } Un1;
};


struct str4
{
  int i1;
  short s1;
  char c1, u1;
  struct Str3 *str1;
};

#pragma omp declare simd 
extern float bar (float value);

float foo (struct str4 *Map)
{
  int i;
  float Value;
  float Total = 0.0;
#pragma omp simd
   for (i = 0; i < Map->s1; i++)
   {
     Value = Map->str1[i].f1;
//     Value = bar (Value);
     Total += Value;
   }
  return Total;
}


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-15 14:25                               ` Sergey Ostanevich
@ 2013-11-15 15:11                                 ` Jakub Jelinek
  2013-11-15 15:24                                 ` Richard Biener
  1 sibling, 0 replies; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-15 15:11 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Richard Biener, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Fri, Nov 15, 2013 at 06:06:24PM +0400, Sergey Ostanevich wrote:
> here's an example that triggers the cost model. As soon as elemental
> functions appear and we update the vectorizer so it can accept an
> elemental function inside the loop, we will have the same situation as
> we have now: the cost model will bail out on its profitability estimate.

Well, right now in the pending elemental patches there is no cost adjustment
in vectorize_simd_clone_call, which is wrong, because that effectively means
a call to the simd clone is considered to have zero cost.  Perhaps we should
use the scalar cost of the call (which is considered just one instruction
anyway), plus perhaps some bigger overhead for argument setup if needed.
But of course we don't know the exact cost of the scalar version of the
function, nor of the vectorized one, and especially if it is not in the
current TU, we really can't know it.

I wonder if we shouldn't introduce
-fsimd-vect-cost-model=
where user could override the vect cost model for force_vect loops to
something else, be it -fsimd-vect-cost-model=unlimited to get what you are
asking, #pragma omp simd disregarding the cost model always, or
-fsimd-vect-cost-model=dynamic -fvect-cost-model=cheap etc.

	Jakub
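
The selection rule proposed above can be sketched in a few lines.  This is
only an illustration under stated assumptions: the enum, the struct and the
function names below are hypothetical and are not taken from the GCC
sources; the only thing borrowed from the thread is the idea that
force_vect loops consult a separate setting.

```c
/* Hypothetical model kinds, mirroring the option values named above.  */
enum cost_model_kind
{
  COST_MODEL_UNLIMITED,
  COST_MODEL_DYNAMIC,
  COST_MODEL_CHEAP
};

/* Hypothetical stand-in for the per-loop flag discussed in the thread.  */
struct loop_flags
{
  int force_vect;   /* Nonzero for loops marked with #pragma omp simd.  */
};

/* Pick the model for one loop: force_vect loops use the value of the
   proposed -fsimd-vect-cost-model=, all other loops keep using the
   value of -fvect-cost-model=.  */
static enum cost_model_kind
loop_cost_model (const struct loop_flags *loop,
                 enum cost_model_kind vect_model,
                 enum cost_model_kind simd_model)
{
  return loop->force_vect ? simd_model : vect_model;
}

/* Nonzero if profitability checks should be skipped for this loop.  */
static int
cost_model_disabled_p (const struct loop_flags *loop,
                       enum cost_model_kind vect_model,
                       enum cost_model_kind simd_model)
{
  return loop_cost_model (loop, vect_model, simd_model)
         == COST_MODEL_UNLIMITED;
}
```

With simd_model set to COST_MODEL_UNLIMITED and vect_model left at
COST_MODEL_DYNAMIC, only the loops marked by the pragma would skip the
profitability check, which is the per-loop behaviour asked for earlier in
the thread.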

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-15 14:25                               ` Sergey Ostanevich
  2013-11-15 15:11                                 ` Jakub Jelinek
@ 2013-11-15 15:24                                 ` Richard Biener
  2013-11-18 16:23                                   ` Sergey Ostanevich
  1 sibling, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-15 15:24 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Fri, 15 Nov 2013, Sergey Ostanevich wrote:

> Richard,
> 
> here's an example that triggers the cost model.

I hardly believe that (AVX2)

.L9:
        vmovups (%rsi), %xmm3
        addl    $1, %r8d
        addq    $256, %rsi
        vinsertf128     $0x1, -240(%rsi), %ymm3, %ymm1
        vmovups -224(%rsi), %xmm3
        vinsertf128     $0x1, -208(%rsi), %ymm3, %ymm3
        vshufps $136, %ymm3, %ymm1, %ymm3
        vperm2f128      $3, %ymm3, %ymm3, %ymm2
        vshufps $68, %ymm2, %ymm3, %ymm1
        vshufps $238, %ymm2, %ymm3, %ymm2
        vmovups -192(%rsi), %xmm3
        vinsertf128     $1, %xmm2, %ymm1, %ymm2
        vinsertf128     $0x1, -176(%rsi), %ymm3, %ymm1
        vmovups -160(%rsi), %xmm3
        vinsertf128     $0x1, -144(%rsi), %ymm3, %ymm3
        vshufps $136, %ymm3, %ymm1, %ymm3
        vperm2f128      $3, %ymm3, %ymm3, %ymm1
        vshufps $68, %ymm1, %ymm3, %ymm4
        vshufps $238, %ymm1, %ymm3, %ymm1
        vmovups -128(%rsi), %xmm3
        vinsertf128     $1, %xmm1, %ymm4, %ymm1
        vshufps $136, %ymm1, %ymm2, %ymm1
        vperm2f128      $3, %ymm1, %ymm1, %ymm2
        vshufps $68, %ymm2, %ymm1, %ymm4
        vshufps $238, %ymm2, %ymm1, %ymm2
        vinsertf128     $0x1, -112(%rsi), %ymm3, %ymm1
        vmovups -96(%rsi), %xmm3
        vinsertf128     $1, %xmm2, %ymm4, %ymm4
        vinsertf128     $0x1, -80(%rsi), %ymm3, %ymm3
        vshufps $136, %ymm3, %ymm1, %ymm3
        vperm2f128      $3, %ymm3, %ymm3, %ymm2
        vshufps $68, %ymm2, %ymm3, %ymm1
        vshufps $238, %ymm2, %ymm3, %ymm2
        vmovups -64(%rsi), %xmm3
        vinsertf128     $1, %xmm2, %ymm1, %ymm2
        vinsertf128     $0x1, -48(%rsi), %ymm3, %ymm1
        vmovups -32(%rsi), %xmm3
        vinsertf128     $0x1, -16(%rsi), %ymm3, %ymm3
        cmpl    %r8d, %edi
        vshufps $136, %ymm3, %ymm1, %ymm3
        vperm2f128      $3, %ymm3, %ymm3, %ymm1
        vshufps $68, %ymm1, %ymm3, %ymm5
        vshufps $238, %ymm1, %ymm3, %ymm1
        vinsertf128     $1, %xmm1, %ymm5, %ymm1
        vshufps $136, %ymm1, %ymm2, %ymm1
        vperm2f128      $3, %ymm1, %ymm1, %ymm2
        vshufps $68, %ymm2, %ymm1, %ymm3
        vshufps $238, %ymm2, %ymm1, %ymm2
        vinsertf128     $1, %xmm2, %ymm3, %ymm1
        vshufps $136, %ymm1, %ymm4, %ymm1
        vperm2f128      $3, %ymm1, %ymm1, %ymm2
        vshufps $68, %ymm2, %ymm1, %ymm3
        vshufps $238, %ymm2, %ymm1, %ymm2
        vinsertf128     $1, %xmm2, %ymm3, %ymm2
        vaddps  %ymm2, %ymm0, %ymm0
        ja      .L9

is more efficient than

.L3:
        vaddss  (%rcx,%rax), %xmm0, %xmm0
        addq    $32, %rax
        cmpq    %rdx, %rax
        jne     .L3

;)
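
For reference, a loop of roughly the following shape would give the
scalar .L3 code above: one vaddss per "addq $32" means one float is read
every 32 bytes, i.e. a stride of 8 elements, and the vector loop advances
256 bytes to feed eight such elements into a single vaddps.  This is only
a guess at the testcase, which is not quoted in this mail; the names
strided_sum and STRIDE are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical reconstruction, not the testcase from the thread.
   STRIDE = 8 floats = 32 bytes matches the "addq $32" in .L3.  */
#define STRIDE 8

/* Sum every STRIDE-th element of a[].  n is the number of elements
   summed, so a[] must hold at least (n - 1) * STRIDE + 1 floats.  */
float
strided_sum (const float *a, size_t n)
{
  float sum = 0.0f;
  size_t i;

  /* The pragma is ignored unless OpenMP SIMD support is enabled
     (e.g. -fopenmp).  */
#pragma omp simd reduction(+:sum)
  for (i = 0; i < n; i++)
    sum += a[i * STRIDE];
  return sum;
}
```

Each vector lane has to be assembled from a separate 32-byte-distant
load, which is where the vinsertf128/vshufps/vperm2f128 sequence in .L9
comes from.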

> As soon as
> elemental functions will appear and we update the vectorizer so it can accept
> an elemental function inside the loop - we will have the same
> situation as we have
> it now: cost model will bail out with profitability estimation.

Yes.

> Still we have no chance to get info on how efficient the bar() function when it
> is in vector form.

Well I assume you mean that the speedup when vectorizing the elemental
will offset whatever wreckage we cause with vectorizing the rest of the
statements.  I'd say you can at least compare to unrolling by
the vectorization factor, building the vector inputs to the elemental
from scalars, distributing the vector result from the elemental to
scalars.
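
A minimal sketch of that baseline, in plain C so it does not depend on
any vector extension.  Everything here is an assumption made for
illustration: VF = 4, the vec_t stand-in for a vector register, the body
of bar () and its vector variant bar_vec ().  Only the call is "vector";
the inputs are built from scalars and the result is distributed back to
scalars.

```c
#include <stddef.h>

#define VF 4                    /* assumed vectorization factor */

/* Stand-in for a vector register holding VF floats.  */
typedef struct { float lane[VF]; } vec_t;

/* Hypothetical elemental function ...  */
static float
bar (float x)
{
  return 2.0f * x + 1.0f;
}

/* ... and its hypothetical vector variant.  */
static vec_t
bar_vec (vec_t x)
{
  vec_t r;
  for (int l = 0; l < VF; l++)
    r.lane[l] = bar (x.lane[l]);
  return r;
}

/* out[i] = bar (in[i * stride]) for i in [0, n), unrolled by VF.  */
void
apply_unrolled (float *out, const float *in, size_t n, size_t stride)
{
  size_t i = 0;

  for (; i + VF <= n; i += VF)
    {
      vec_t x, r;
      for (int l = 0; l < VF; l++)      /* build vector input from scalars */
        x.lane[l] = in[(i + l) * stride];
      r = bar_vec (x);                  /* one call to the vector variant */
      for (int l = 0; l < VF; l++)      /* distribute result to scalars */
        out[i + l] = r.lane[l];
    }
  for (; i < n; i++)                    /* scalar epilogue */
    out[i] = bar (in[i * stride]);
}
```

The cost of this form can be estimated without knowing anything about
the inside of the vector variant beyond the cost of one call to it.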

> I believe I should repeat: #pragma omp simd is intended for introduction of an
> instruction-level parallel region on developer's request, hence should
> be treated
> in same manner as #pragma omp parallel. Vectorizer cost model is an obstacle
> here, not a help.

Surely not if there isn't an elemental call in it.  With one, the
cost model of course will not have enough information to decide.

But still, what's the difference to the case where we cannot vectorize
the function?  What happens if we cannot vectorize the elemental?
Do we have to build scalar versions for all possible vector sizes?

Richard.

> Regards,
> Sergos
> 
> 
> On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener <rguenther@suse.de> wrote:
> > Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
> >>this is only for the whole file? I mean to have a particular loop
> >>vectorized in a
> >>file while all others - up to compiler's cost model. is there such a
> >>machinery?
> >
> > No, there is not.
> >
> > Richard.
> >
> >>Sergos
> >>
> >>On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener <rguenther@suse.de>
> >>wrote:
> >>> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
> >>>
> >>>> I will get some tests.
> >>>> As for cost analysis - simply consider the pragma as a request to
> >>>> vectorize. How can I - as a developer - enforce it beyond the
> >>pragma?
> >>>
> >>> You can disable the cost model via -fvect-cost-model=unlimited
> >>>
> >>> Richard.
> >>>
> >>>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguenther@suse.de>
> >>wrote:
> >>>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
> >>>> >
> >>>> >> The reason patch was in its original state is because we want
> >>>> >> to notify user that his assumption of profitability may be wrong.
> >>>> >> This is not a part of any spec and as far as I know ICC does not
> >>>> >> notify user about the case. Still it can be a good hint for those
> >>>> >> users who tries to get as much as possible performance.
> >>>> >>
> >>>> >> Richard's comment on the vectorization problems is about the same
> >>-
> >>>> >> to inform user that his attempt to force vectorization is failed.
> >>>> >>
> >>>> >> As for profitable or not - sometimes I believe it's impossible to
> >>be
> >>>> >> precise. For OMP we have case of a vector version of a function
> >>>> >> and we have no chance to figure out whether it is profitable to
> >>use
> >>>> >> it or to loose it. If we can't map the loop for any vector length
> >>>> >> other than 1 - I believe in this case we have to bail out and
> >>report.
> >>>> >> Is it about 'never profitable'?
> >>>> >
> >>>> > For example.  I think we should report non-vectorized loops
> >>>> > that are marked with force_vect anyway, with
> >>-Wdisabled-optimization.
> >>>> > Another case is that a loop may be profitable to vectorize if
> >>>> > the ISA supports a gather instruction but otherwise not.  Or if
> >>the
> >>>> > ISA supports efficient vector construction from N not loop
> >>>> > invariant scalars (for vectorization of strided loads).
> >>>> >
> >>>> > Simply disregarding all of the cost analysis sounds completely
> >>>> > bogus to me.
> >>>> >
> >>>> > I'd simply go for the diagnostic for now, not changing anything
> >>else.
> >>>> > We want to have a good understanding about why the cost model is
> >>>> > so bad that we have to force to ignore it for #pragma simd - thus
> >>we
> >>>> > want testcases.
> >>>> >
> >>>> > Richard.
> >>>> >
> >>>> >>
> >>>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener
> >><rguenther@suse.de> wrote:
> >>>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
> >>>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich
> >>wrote:
> >>>> >> >>> ivdep just substitutes all cross-iteration data analysis,
> >>>> >> >>> nothing related to cost model. ICC does not cancel its
> >>>> >> >>> cost model in case of #pragma ivdep
> >>>> >> >>>
> >>>> >> >>> as for the safelen - OMP standart treats it as a limitation
> >>>> >> >>> for the vector length. this means if no safelen is present
> >>>> >> >>> an arbitrary vector length can be used.
> >>>> >> >>
> >>>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for
> >>#pragma omp simd
> >>>> >> >> without safelen clause or #pragma simd without vectorlength
> >>clause.
> >>>> >> >>
> >>>> >> >>> so I believe loop->force_vect is the only trigger to
> >>disregard
> >>>> >> >>> the cost model
> >>>> >> >>
> >>>> >> >> Anyway, in that case I think the originally posted patch is
> >>wrong,
> >>>> >> >> if we want to treat force_vect as disregard all the cost model
> >>and
> >>>> >> >> force vectorization (well, the name of the field already kind
> >>of suggest
> >>>> >> >> that), then IMHO we should treat it the same as
> >>-fvect-cost-model=unlimited
> >>>> >> >> for those loops.
> >>>> >> >
> >>>> >> > Err - the user may have a specific sub-architecture in mind
> >>when using
> >>>> >> > #pragma simd, if you say we should completely ignore the cost
> >>model
> >>>> >> > then should we also sorry () if we cannot vectorize the loop
> >>(either
> >>>> >> > because of GCC deficiencies or lack of sub-target support)?
> >>>> >> >
> >>>> >> > That said, at least in the cases that the cost model says the
> >>loop
> >>>> >> > is never profitable to vectorize we should follow its advice.
> >>>> >> >
> >>>> >> > Richard.
> >>>> >> >
> >>>> >> >> Thus (untested):
> >>>> >> >>
> >>>> >> >> 2013-11-12  Jakub Jelinek  <jakub@redhat.com>
> >>>> >> >>
> >>>> >> >>       * tree-vect-loop.c (vect_estimate_min_profitable_iters):
> >>Use
> >>>> >> >>       unlimited cost model also for force_vect loops.
> >>>> >> >>
> >>>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.000000000
> >>+0100
> >>>> >> >> +++ gcc/tree-vect-loop.c      2013-11-12 15:11:43.821404330
> >>+0100
> >>>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
> >>>> >> >>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA
> >>(loop_vinfo);
> >>>> >> >>
> >>>> >> >>    /* Cost model disabled.  */
> >>>> >> >> -  if (unlimited_cost_model ())
> >>>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP
> >>(loop_vinfo)->force_vect)
> >>>> >> >>      {
> >>>> >> >>        dump_printf_loc (MSG_NOTE, vect_location, "cost model
> >>disabled.\n");
> >>>> >> >>        *ret_min_profitable_niters = 0;
> >>>> >> >>
> >>>> >> >>       Jakub
> >>>> >> >>
> >>>> >> >
> >>>> >>
> >>>> >>
> >>>> >
> >>>> > --
> >>>> > Richard Biener <rguenther@suse.de>
> >>>> > SUSE / SUSE Labs
> >>>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> >>>> > GF: Jeff Hawn, Jennifer Guild, Felix Imend
> >>>>
> >>>>
> >>>
> >>> --
> >>> Richard Biener <rguenther@suse.de>
> >>> SUSE / SUSE Labs
> >>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> >>> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
> >
> >
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-15 15:24                                 ` Richard Biener
@ 2013-11-18 16:23                                   ` Sergey Ostanevich
  2013-11-18 16:45                                     ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-18 16:23 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

I would agree that the example is just a case where the cost model makes
the correct estimation. But how can we assure ourselves that it won't make
any mistakes in the future?

I believe it'll be OK to introduce an extra flag, as Jakub proposed, so that
dedicated simd-forced vectorization uses the unlimited cost model. This
can be the default for -fopenmp, or there should be a warning issued that
the compiler overrides the user's request for vectorization. In such a case
the user can enforce vectorization (even with the mentioned results :) with
this unlimited cost model for simd.
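
To make the proposal concrete, here is a toy model of the policy.  None
of these names exist in GCC: decide (), the simd_unlimited switch and the
enum are all hypothetical, and the real decision would of course live in
the vectorizer, not in a standalone function.

```c
#include <stdbool.h>

/* Possible outcomes for one loop.  */
enum vect_decision
{
  VECT_NO,              /* leave the loop scalar, silently */
  VECT_NO_WARN,         /* leave it scalar and warn the user */
  VECT_YES              /* vectorize */
};

/* force_vect:     loop was marked with #pragma omp simd.
   simd_unlimited: hypothetical flag asking for the unlimited cost
                   model on simd-marked loops only.
   profitable:     verdict of the regular cost model.  */
enum vect_decision
decide (bool force_vect, bool simd_unlimited, bool profitable)
{
  if (profitable)
    return VECT_YES;
  if (!force_vect)
    return VECT_NO;             /* ordinary loop: trust the cost model */
  if (simd_unlimited)
    return VECT_YES;            /* user asked to bypass the cost model */
  return VECT_NO_WARN;          /* the pragma was overridden: say so */
}
```

Loops without the pragma are unaffected by the flag, so the rest of the
file still goes through the regular cost model.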




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-18 16:23                                   ` Sergey Ostanevich
@ 2013-11-18 16:45                                     ` Richard Biener
  2013-11-19 14:48                                       ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-18 16:45 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Mon, 18 Nov 2013, Sergey Ostanevich wrote:

> I would agree that the example is just a case where the cost model makes
> the correct estimation. But how can we assure ourselves that it won't make
> any mistakes in the future?

We call it bugs and not mistakes and we have bugzilla for it.

Richard.

> I believe it'll be OK to introduce an extra flag, as Jakub proposed, so that
> dedicated simd-forced vectorization uses the unlimited cost model. This
> can be the default for -fopenmp, or there should be a warning issued that
> the compiler overrides the user's request for vectorization. In such a case
> the user can enforce vectorization (even with the mentioned results :) with
> this unlimited cost model for simd.

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-18 16:45                                     ` Richard Biener
@ 2013-11-19 14:48                                       ` Sergey Ostanevich
  2013-11-19 14:57                                         ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-19 14:48 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

:) I agree with you, but if you were a user trying to introduce
vector code and you hit a bug in the cost model, you would want a
workaround until the bug is fixed and the new compiler reaches you
with a new OS distribution, wouldn't you?

I propose the following, although SLP has to pass NULL as the loop
info, which looks somewhat hacky.

Sergos


        * common.opt (fsimd-vect-cost-model=): New option.
        * tree-vectorizer.h (unlimited_cost_model): Take the loop as an
        argument; also return true for force_vect loops when
        -fsimd-vect-cost-model is unlimited.
        * tree-vect-data-refs.c (vect_peeling_hash_insert): Update the
        unlimited_cost_model call to the new interface.
        (vect_peeling_hash_choose_best_peeling): Ditto.
        (vect_enhance_data_refs_alignment): Ditto.
        * tree-vect-slp.c (vect_slp_analyze_bb_1): Ditto.
        * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto.
        Also warn when the cost model overrides the user's directive.



diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..87b3b37 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,6 +2296,10 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-vect-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_vect_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the cost model for vectorization in loops marked with #pragma omp simd
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..e26f704 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling (loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;

               /* Handle the aligned case. We may decide to align some other
@@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 86ebbd2..be66172 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+        {
+          pedwarn (vect_location, 0, "Vectorization did not happen for the loop");
+        }
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
  "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 247bdfd..4b25964 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }

   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index a6c5b59..2916906 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -919,9 +919,12 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)

 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
+          || (loop != NULL
+              && loop->force_vect
+              && flag_simd_vect_cost_model == VECT_COST_MODEL_UNLIMITED));
 }

 /* Source location */
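
To make the effect of the tree-vectorizer.h hunk easier to see, here is a
minimal standalone sketch of the same predicate outside of GCC. Everything
named *_sketch below is a hypothetical stand-in, not a GCC declaration; only
the boolean structure is taken from the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's enum vect_cost_model.  */
enum cost_model_sketch { COST_UNLIMITED_SKETCH, COST_DYNAMIC_SKETCH };

/* Hypothetical stand-in for struct loop; only force_vect matters here.  */
struct loop_sketch { bool force_vect; };

/* Stand-ins for flag_vect_cost_model and the proposed
   flag_simd_vect_cost_model, using the defaults the patch suggests:
   a limited model globally, unlimited for simd-marked loops.  */
static enum cost_model_sketch flag_vect_sketch = COST_DYNAMIC_SKETCH;
static enum cost_model_sketch flag_simd_vect_sketch = COST_UNLIMITED_SKETCH;

/* True if the cost model should be ignored for LOOP.  A NULL loop
   (the SLP case in the patch) only consults the global flag.  */
static bool
unlimited_cost_model_sketch (const struct loop_sketch *loop)
{
  return (flag_vect_sketch == COST_UNLIMITED_SKETCH
          || (loop != NULL
              && loop->force_vect
              && flag_simd_vect_sketch == COST_UNLIMITED_SKETCH));
}
```

With these defaults a loop marked force_vect skips the cost model, an
unmarked loop does not, and the NULL passed from the SLP path falls back to
the global flag alone.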

On Mon, Nov 18, 2013 at 7:13 PM, Richard Biener <rguenther@suse.de> wrote:
> On Mon, 18 Nov 2013, Sergey Ostanevich wrote:
>
>> I would agree that the example is just for the case cost model makes
>> correct estimation But how can we assure ourself that it won't have any
>> mistakes in the future?
>
> We call it bugs and not mistakes and we have bugzilla for it.
>
> Richard.
>
>> I believe it'll be Ok to introduce an extra flag as Jakub proposed for the
>> dedicated simd-forced vectorization to use unlimited cost model. This
>> can be default for -fopenmp or there should be a warning issued that
>> compiler overrides user's request of vectorization. In such a case user
>> can enforce vectorization (even with mentioned results :) with this
>> unlimited cost model for simd.
>>
>>
>>
>> On Fri, Nov 15, 2013 at 6:24 PM, Richard Biener <rguenther@suse.de> wrote:
>> > On Fri, 15 Nov 2013, Sergey Ostanevich wrote:
>> >
>> >> Richard,
>> >>
>> >> here's an example that causes trigger for the cost model.
>> >
>> > I hardly believe that (AVX2)
>> >
>> > .L9:
>> >         vmovups (%rsi), %xmm3
>> >         addl    $1, %r8d
>> >         addq    $256, %rsi
>> >         vinsertf128     $0x1, -240(%rsi), %ymm3, %ymm1
>> >         vmovups -224(%rsi), %xmm3
>> >         vinsertf128     $0x1, -208(%rsi), %ymm3, %ymm3
>> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >         vperm2f128      $3, %ymm3, %ymm3, %ymm2
>> >         vshufps $68, %ymm2, %ymm3, %ymm1
>> >         vshufps $238, %ymm2, %ymm3, %ymm2
>> >         vmovups -192(%rsi), %xmm3
>> >         vinsertf128     $1, %xmm2, %ymm1, %ymm2
>> >         vinsertf128     $0x1, -176(%rsi), %ymm3, %ymm1
>> >         vmovups -160(%rsi), %xmm3
>> >         vinsertf128     $0x1, -144(%rsi), %ymm3, %ymm3
>> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >         vperm2f128      $3, %ymm3, %ymm3, %ymm1
>> >         vshufps $68, %ymm1, %ymm3, %ymm4
>> >         vshufps $238, %ymm1, %ymm3, %ymm1
>> >         vmovups -128(%rsi), %xmm3
>> >         vinsertf128     $1, %xmm1, %ymm4, %ymm1
>> >         vshufps $136, %ymm1, %ymm2, %ymm1
>> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
>> >         vshufps $68, %ymm2, %ymm1, %ymm4
>> >         vshufps $238, %ymm2, %ymm1, %ymm2
>> >         vinsertf128     $0x1, -112(%rsi), %ymm3, %ymm1
>> >         vmovups -96(%rsi), %xmm3
>> >         vinsertf128     $1, %xmm2, %ymm4, %ymm4
>> >         vinsertf128     $0x1, -80(%rsi), %ymm3, %ymm3
>> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >         vperm2f128      $3, %ymm3, %ymm3, %ymm2
>> >         vshufps $68, %ymm2, %ymm3, %ymm1
>> >         vshufps $238, %ymm2, %ymm3, %ymm2
>> >         vmovups -64(%rsi), %xmm3
>> >         vinsertf128     $1, %xmm2, %ymm1, %ymm2
>> >         vinsertf128     $0x1, -48(%rsi), %ymm3, %ymm1
>> >         vmovups -32(%rsi), %xmm3
>> >         vinsertf128     $0x1, -16(%rsi), %ymm3, %ymm3
>> >         cmpl    %r8d, %edi
>> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >         vperm2f128      $3, %ymm3, %ymm3, %ymm1
>> >         vshufps $68, %ymm1, %ymm3, %ymm5
>> >         vshufps $238, %ymm1, %ymm3, %ymm1
>> >         vinsertf128     $1, %xmm1, %ymm5, %ymm1
>> >         vshufps $136, %ymm1, %ymm2, %ymm1
>> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
>> >         vshufps $68, %ymm2, %ymm1, %ymm3
>> >         vshufps $238, %ymm2, %ymm1, %ymm2
>> >         vinsertf128     $1, %xmm2, %ymm3, %ymm1
>> >         vshufps $136, %ymm1, %ymm4, %ymm1
>> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
>> >         vshufps $68, %ymm2, %ymm1, %ymm3
>> >         vshufps $238, %ymm2, %ymm1, %ymm2
>> >         vinsertf128     $1, %xmm2, %ymm3, %ymm2
>> >         vaddps  %ymm2, %ymm0, %ymm0
>> >         ja      .L9
>> >
>> > is more efficient than
>> >
>> > .L3:
>> >         vaddss  (%rcx,%rax), %xmm0, %xmm0
>> >         addq    $32, %rax
>> >         cmpq    %rdx, %rax
>> >         jne     .L3
>> >
>> > ;)
>> >
>> >> As soon as
>> >> elemental functions will appear and we update the vectorizer so it can accept
>> >> an elemental function inside the loop - we will have the same
>> >> situation as we have
>> >> it now: cost model will bail out with profitability estimation.
>> >
>> > Yes.
>> >
>> >> Still we have no chance to get info on how efficient the bar() function when it
>> >> is in vector form.
>> >
>> > Well I assume you mean that the speedup when vectorizing the elemental
>> > will offset whatever wreckage we cause with vectorizing the rest of the
>> > statements.  I'd say you can at least compare to unrolling by
>> > the vectorization factor, building the vector inputs to the elemental
>> > from scalars, distributing the vector result from the elemental to
>> > scalars.
>> >
>> >> I believe I should repeat: #pragma omp simd is intended for introduction of an
>> >> instruction-level parallel region on developer's request, hence should
>> >> be treated
>> >> in same manner as #pragma omp parallel. Vectorizer cost model is an obstacle
>> >> here, not a help.
>> >
>> > Surely not if there isn't an elemental call in it.  With it the
>> > cost model of course will have not enough information to decide.
>> >
>> > But still, what's the difference to the case where we cannot vectorize
>> > the function?  What happens if we cannot vectorize the elemental?
>> > Do we have to build scalar versions for all possible vector sizes?
>> >
>> > Richard.
>> >
>> >> Regards,
>> >> Sergos
>> >>
>> >>
>> >> On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener <rguenther@suse.de> wrote:
>> >> > Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
>> >> >>this is only for the whole file? I mean to have a particular loop
>> >> >>vectorized in a
>> >> >>file while all others - up to compiler's cost model. is there such a
>> >> >>machinery?
>> >> >
>> >> > No, there is not.
>> >> >
>> >> > Richard.
>> >> >
>> >> >>Sergos
>> >> >>
>> >> >>On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener <rguenther@suse.de>
>> >> >>wrote:
>> >> >>> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
>> >> >>>
>> >> >>>> I will get some tests.
>> >> >>>> As for cost analysis - simply consider the pragma as a request to
>> >> >>>> vectorize. How can I - as a developer - enforce it beyond the
>> >> >>pragma?
>> >> >>>
>> >> >>> You can disable the cost model via -fvect-cost-model=unlimited
>> >> >>>
>> >> >>> Richard.
>> >> >>>
>> >> >>>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguenther@suse.de>
>> >> >>wrote:
>> >> >>>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
>> >> >>>> >
>> >> >>>> >> The reason patch was in its original state is because we want
>> >> >>>> >> to notify user that his assumption of profitability may be wrong.
>> >> >>>> >> This is not a part of any spec and as far as I know ICC does not
>> >> >>>> >> notify user about the case. Still it can be a good hint for those
>> >> >>>> >> users who tries to get as much as possible performance.
>> >> >>>> >>
>> >> >>>> >> Richard's comment on the vectorization problems is about the same
>> >> >>-
>> >> >>>> >> to inform user that his attempt to force vectorization is failed.
>> >> >>>> >>
>> >> >>>> >> As for profitable or not - sometimes I believe it's impossible to
>> >> >>be
>> >> >>>> >> precise. For OMP we have case of a vector version of a function
>> >> >>>> >> and we have no chance to figure out whether it is profitable to
>> >> >>use
>> >> >>>> >> it or to loose it. If we can't map the loop for any vector length
>> >> >>>> >> other than 1 - I believe in this case we have to bail out and
>> >> >>report.
>> >> >>>> >> Is it about 'never profitable'?
>> >> >>>> >
>> >> >>>> > For example.  I think we should report non-vectorized loops
>> >> >>>> > that are marked with force_vect anyway, with
>> >> >>-Wdisabled-optimization.
>> >> >>>> > Another case is that a loop may be profitable to vectorize if
>> >> >>>> > the ISA supports a gather instruction but otherwise not.  Or if
>> >> >>the
>> >> >>>> > ISA supports efficient vector construction from N not loop
>> >> >>>> > invariant scalars (for vectorization of strided loads).
>> >> >>>> >
>> >> >>>> > Simply disregarding all of the cost analysis sounds completely
>> >> >>>> > bogus to me.
>> >> >>>> >
>> >> >>>> > I'd simply go for the diagnostic for now, not changing anything
>> >> >>else.
>> >> >>>> > We want to have a good understanding about why the cost model is
>> >> >>>> > so bad that we have to force to ignore it for #pragma simd - thus
>> >> >>we
>> >> >>>> > want testcases.
>> >> >>>> >
>> >> >>>> > Richard.
>> >> >>>> >
>> >> >>>> >>
>> >> >>>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener
>> >> >><rguenther@suse.de> wrote:
>> >> >>>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
>> >> >>>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich
>> >> >>wrote:
>> >> >>>> >> >>> ivdep just substitutes all cross-iteration data analysis,
>> >> >>>> >> >>> nothing related to cost model. ICC does not cancel its
>> >> >>>> >> >>> cost model in case of #pragma ivdep
>> >> >>>> >> >>>
>> >> >>>> >> >>> as for the safelen - OMP standart treats it as a limitation
>> >> >>>> >> >>> for the vector length. this means if no safelen is present
>> >> >>>> >> >>> an arbitrary vector length can be used.
>> >> >>>> >> >>
>> >> >>>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for
>> >> >>#pragma omp simd
>> >> >>>> >> >> without safelen clause or #pragma simd without vectorlength
>> >> >>clause.
>> >> >>>> >> >>
>> >> >>>> >> >>> so I believe loop->force_vect is the only trigger to
>> >> >>disregard
>> >> >>>> >> >>> the cost model
>> >> >>>> >> >>
>> >> >>>> >> >> Anyway, in that case I think the originally posted patch is
>> >> >>wrong,
>> >> >>>> >> >> if we want to treat force_vect as disregard all the cost model
>> >> >>and
>> >> >>>> >> >> force vectorization (well, the name of the field already kind
>> >> >>of suggest
>> >> >>>> >> >> that), then IMHO we should treat it the same as
>> >> >>-fvect-cost-model=unlimited
>> >> >>>> >> >> for those loops.
>> >> >>>> >> >
>> >> >>>> >> > Err - the user may have a specific sub-architecture in mind
>> >> >>when using
>> >> >>>> >> > #pragma simd, if you say we should completely ignore the cost
>> >> >>model
>> >> >>>> >> > then should we also sorry () if we cannot vectorize the loop
>> >> >>(either
>> >> >>>> >> > because of GCC deficiencies or lack of sub-target support)?
>> >> >>>> >> >
>> >> >>>> >> > That said, at least in the cases that the cost model says the
>> >> >>loop
>> >> >>>> >> > is never profitable to vectorize we should follow its advice.
>> >> >>>> >> >
>> >> >>>> >> > Richard.
>> >> >>>> >> >
>> >> >>>> >> >> Thus (untested):
>> >> >>>> >> >>
>> >> >>>> >> >> 2013-11-12  Jakub Jelinek  <jakub@redhat.com>
>> >> >>>> >> >>
>> >> >>>> >> >>       * tree-vect-loop.c (vect_estimate_min_profitable_iters):
>> >> >>Use
>> >> >>>> >> >>       unlimited cost model also for force_vect loops.
>> >> >>>> >> >>
>> >> >>>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.000000000
>> >> >>+0100
>> >> >>>> >> >> +++ gcc/tree-vect-loop.c      2013-11-12 15:11:43.821404330
>> >> >>+0100
>> >> >>>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
>> >> >>>> >> >>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA
>> >> >>(loop_vinfo);
>> >> >>>> >> >>
>> >> >>>> >> >>    /* Cost model disabled.  */
>> >> >>>> >> >> -  if (unlimited_cost_model ())
>> >> >>>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP
>> >> >>(loop_vinfo)->force_vect)
>> >> >>>> >> >>      {
>> >> >>>> >> >>        dump_printf_loc (MSG_NOTE, vect_location, "cost model
>> >> >>disabled.\n");
>> >> >>>> >> >>        *ret_min_profitable_niters = 0;
>> >> >>>> >> >>
>> >> >>>> >> >>       Jakub
>> >> >>>> >> >>
>> >> >>>> >> >
>> >> >>>> >>
>> >> >>>> >>
>> >> >>>> >
>> >> >>>> > --
>> >> >>>> > Richard Biener <rguenther@suse.de>
>> >> >>>> > SUSE / SUSE Labs
>> >> >>>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> >> >>>> > GF: Jeff Hawn, Jennifer Guild, Felix Imend
>> >> >>>>
>> >> >>>>
>> >> >>>
>> >> >>> --
>> >> >>> Richard Biener <rguenther@suse.de>
>> >> >>> SUSE / SUSE Labs
>> >> >>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> >> >>> GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
>> >> >
>> >> >
>> >>
>> >
>> > --
>> > Richard Biener <rguenther@suse.de>
>> > SUSE / SUSE Labs
>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> > GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
>>
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 14:48                                       ` Sergey Ostanevich
@ 2013-11-19 14:57                                         ` Richard Biener
  2013-11-19 14:58                                           ` Jakub Jelinek
  2013-11-19 15:02                                           ` Sergey Ostanevich
  0 siblings, 2 replies; 44+ messages in thread
From: Richard Biener @ 2013-11-19 14:57 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Tue, 19 Nov 2013, Sergey Ostanevich wrote:

> :) agree to you, but as soon as you're a user who tries to introduce
> vector code and face a bug in cost model you'd like to have a
> workaround until the bug will be fixed and compiler will come to you
> with new OS distribution, don't you?
> 
> I propose the following, yet SLP have to use a NULL as a loop info
> which looks somewhat hacky.

I think this is over-engineering.  -fvect-cost-model will do as a
workaround.  And the name -fsimd-vect-cost-model contains what I
consider a duplication: "simd" and "vect".

Richard.
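
For readers following the workaround: the sketch below is a loop of the kind
the quoted scalar assembly suggests (a float sum that advances 32 bytes, i.e.
8 floats, per iteration). The original test case is not reproduced in this
message, so the function name and the stride are assumptions inferred from
that assembly, not the actual example.

```c
#include <stddef.h>

/* Sum every 8th float of A, N elements in total.  The stride of 8 is an
   assumption inferred from the "addq $32" in the scalar loop quoted in
   this thread.  */
static float
strided_sum (const float *a, size_t n)
{
  float sum = 0.0f;
  /* Ask for vectorization; the pragma is ignored unless OpenMP simd
     support is enabled at compile time.  */
#pragma omp simd reduction(+:sum)
  for (size_t i = 0; i < n; i++)
    sum += a[i * 8];
  return sum;
}
```

Assuming a GCC release that provides these options, compiling with something
like "gcc -O3 -fopenmp-simd -fvect-cost-model=unlimited -fopt-info-vec"
applies the workaround to the whole translation unit; as noted earlier in the
thread, there is no per-loop equivalent.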

> Sergos
> 
> 
>         * common.opt: Added new option -fsimd-vect-cost-model
>         * tree-vectorizer.h (unlimited_cost_model): Interface update
>         to rely on particular loop info
>         * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
>         unlimited_cost_model call according to new interface
>         (vect_peeling_hash_choose_best_peeling): Ditto
>         (vect_enhance_data_refs_alignment): Ditto
>         * tree-vect-slp.c: Ditto
>         * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
>         plus issue a warning in case cost model overrides users' directive
> 
> 
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d5971df..87b3b37 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2296,6 +2296,10 @@ fvect-cost-model=
>  Common Joined RejectNegative Enum(vect_cost_model)
> Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
>  Specifies the cost model for vectorization
> 
> +fsimd-vect-cost-model=
> +Common Joined RejectNegative Enum(vect_cost_model)
> Var(flag_simd_vect_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
> +Specifies the cost model for vectorization in loops marked with
> #pragma omp simd
> +
>  Enum
>  Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
> vectorizer cost model %qs)
> 
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 83d1f45..e26f704 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
> loop_vinfo, struct data_reference *dr,
>        *new_slot = slot;
>      }
> 
> -  if (!supportable_dr_alignment && unlimited_cost_model ())
> +  if (!supportable_dr_alignment
> +      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>      slot->count += VECT_MAX_COST;
>  }
> 
> @@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
> (loop_vec_info loop_vinfo,
>     res.peel_info.dr = NULL;
>     res.body_cost_vec = stmt_vector_for_cost ();
> 
> -   if (!unlimited_cost_model ())
> +   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>       {
>         res.inside_cost = INT_MAX;
>         res.outside_cost = INT_MAX;
> @@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
> loop_vinfo)
>                   vectorization factor.
>                   We do this automtically for cost model, since we
> calculate cost
>                   for every peeling option.  */
> -              if (unlimited_cost_model ())
> +              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>                  possible_npeel_number = vf /nelements;
> 
>                /* Handle the aligned case. We may decide to align some other
> @@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
> loop_vinfo)
>                if (DR_MISALIGNMENT (dr) == 0)
>                  {
>                    npeel_tmp = 0;
> -                  if (unlimited_cost_model ())
> +                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>                      possible_npeel_number++;
>                  }
> 
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index 86ebbd2..be66172 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters
> (loop_vec_info loop_vinfo,
>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
> 
>    /* Cost model disabled.  */
> -  if (unlimited_cost_model ())
> +  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>      {
>        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
>        *ret_min_profitable_niters = 0;
> @@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters
> (loop_vec_info loop_vinfo,
>    /* vector version will never be profitable.  */
>    else
>      {
> +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> +        {
> +          pedwarn (vect_location, 0, "Vectorization did not happen
> for the loop");
> +        }
> +
>        if (dump_enabled_p ())
>          dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>   "cost model: the vector iteration cost = %d "
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index 247bdfd..4b25964 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
>      }
> 
>    /* Cost model: check if the vectorization is worthwhile.  */
> -  if (!unlimited_cost_model ()
> +  if (!unlimited_cost_model (NULL)
>        && !vect_bb_vectorization_profitable_p (bb_vinfo))
>      {
>        if (dump_enabled_p ())
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index a6c5b59..2916906 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -919,9 +919,12 @@ known_alignment_for_access_p (struct
> data_reference *data_ref_info)
> 
>  /* Return true if the vect cost model is unlimited.  */
>  static inline bool
> -unlimited_cost_model ()
> +unlimited_cost_model (loop_p loop)
>  {
> -  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
> +  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
> +          || (loop != NULL
> +              && loop->force_vect
> +              && flag_simd_vect_cost_model == VECT_COST_MODEL_UNLIMITED));
>  }
> 
>  /* Source location */
> 
> On Mon, Nov 18, 2013 at 7:13 PM, Richard Biener <rguenther@suse.de> wrote:
> > On Mon, 18 Nov 2013, Sergey Ostanevich wrote:
> >
> >> I would agree that the example is just for the case cost model makes
> >> correct estimation But how can we assure ourself that it won't have any
> >> mistakes in the future?
> >
> > We call it bugs and not mistakes and we have bugzilla for it.
> >
> > Richard.
> >
> >> I believe it'll be Ok to introduce an extra flag as Jakub proposed for the
> >> dedicated simd-forced vectorization to use unlimited cost model. This
> >> can be default for -fopenmp or there should be a warning issued that
> >> compiler overrides user's request of vectorization. In such a case user
> >> can enforce vectorization (even with mentioned results :) with this
> >> unlimited cost model for simd.
> >>
> >>
> >>
> >> On Fri, Nov 15, 2013 at 6:24 PM, Richard Biener <rguenther@suse.de> wrote:
> >> > On Fri, 15 Nov 2013, Sergey Ostanevich wrote:
> >> >
> >> >> Richard,
> >> >>
> >> >> here's an example that causes trigger for the cost model.
> >> >
> >> > I hardly believe that (AVX2)
> >> >
> >> > .L9:
> >> >         vmovups (%rsi), %xmm3
> >> >         addl    $1, %r8d
> >> >         addq    $256, %rsi
> >> >         vinsertf128     $0x1, -240(%rsi), %ymm3, %ymm1
> >> >         vmovups -224(%rsi), %xmm3
> >> >         vinsertf128     $0x1, -208(%rsi), %ymm3, %ymm3
> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm2
> >> >         vshufps $68, %ymm2, %ymm3, %ymm1
> >> >         vshufps $238, %ymm2, %ymm3, %ymm2
> >> >         vmovups -192(%rsi), %xmm3
> >> >         vinsertf128     $1, %xmm2, %ymm1, %ymm2
> >> >         vinsertf128     $0x1, -176(%rsi), %ymm3, %ymm1
> >> >         vmovups -160(%rsi), %xmm3
> >> >         vinsertf128     $0x1, -144(%rsi), %ymm3, %ymm3
> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm1
> >> >         vshufps $68, %ymm1, %ymm3, %ymm4
> >> >         vshufps $238, %ymm1, %ymm3, %ymm1
> >> >         vmovups -128(%rsi), %xmm3
> >> >         vinsertf128     $1, %xmm1, %ymm4, %ymm1
> >> >         vshufps $136, %ymm1, %ymm2, %ymm1
> >> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
> >> >         vshufps $68, %ymm2, %ymm1, %ymm4
> >> >         vshufps $238, %ymm2, %ymm1, %ymm2
> >> >         vinsertf128     $0x1, -112(%rsi), %ymm3, %ymm1
> >> >         vmovups -96(%rsi), %xmm3
> >> >         vinsertf128     $1, %xmm2, %ymm4, %ymm4
> >> >         vinsertf128     $0x1, -80(%rsi), %ymm3, %ymm3
> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm2
> >> >         vshufps $68, %ymm2, %ymm3, %ymm1
> >> >         vshufps $238, %ymm2, %ymm3, %ymm2
> >> >         vmovups -64(%rsi), %xmm3
> >> >         vinsertf128     $1, %xmm2, %ymm1, %ymm2
> >> >         vinsertf128     $0x1, -48(%rsi), %ymm3, %ymm1
> >> >         vmovups -32(%rsi), %xmm3
> >> >         vinsertf128     $0x1, -16(%rsi), %ymm3, %ymm3
> >> >         cmpl    %r8d, %edi
> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm1
> >> >         vshufps $68, %ymm1, %ymm3, %ymm5
> >> >         vshufps $238, %ymm1, %ymm3, %ymm1
> >> >         vinsertf128     $1, %xmm1, %ymm5, %ymm1
> >> >         vshufps $136, %ymm1, %ymm2, %ymm1
> >> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
> >> >         vshufps $68, %ymm2, %ymm1, %ymm3
> >> >         vshufps $238, %ymm2, %ymm1, %ymm2
> >> >         vinsertf128     $1, %xmm2, %ymm3, %ymm1
> >> >         vshufps $136, %ymm1, %ymm4, %ymm1
> >> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
> >> >         vshufps $68, %ymm2, %ymm1, %ymm3
> >> >         vshufps $238, %ymm2, %ymm1, %ymm2
> >> >         vinsertf128     $1, %xmm2, %ymm3, %ymm2
> >> >         vaddps  %ymm2, %ymm0, %ymm0
> >> >         ja      .L9
> >> >
> >> > is more efficient than
> >> >
> >> > .L3:
> >> >         vaddss  (%rcx,%rax), %xmm0, %xmm0
> >> >         addq    $32, %rax
> >> >         cmpq    %rdx, %rax
> >> >         jne     .L3
> >> >
> >> > ;)
> >> >
> >> >> As soon as
> >> >> elemental functions will appear and we update the vectorizer so it can accept
> >> >> an elemental function inside the loop - we will have the same
> >> >> situation as we have
> >> >> it now: cost model will bail out with profitability estimation.
> >> >
> >> > Yes.
> >> >
> >> >> Still we have no chance to get info on how efficient the bar() function when it
> >> >> is in vector form.
> >> >
> >> > Well I assume you mean that the speedup when vectorizing the elemental
> >> > will offset whatever wreckage we cause with vectorizing the rest of the
> >> > statements.  I'd say you can at least compare to unrolling by
> >> > the vectorization factor, building the vector inputs to the elemental
> >> > from scalars, distributing the vector result from the elemental to
> >> > scalars.
> >> >
> >> >> I believe I should repeat: #pragma omp simd is intended for introduction of an
> >> >> instruction-level parallel region on developer's request, hence should
> >> >> be treated
> >> >> in same manner as #pragma omp parallel. Vectorizer cost model is an obstacle
> >> >> here, not a help.
> >> >
> >> > Surely not if there isn't an elemental call in it.  With it the
> >> > cost model of course will have not enough information to decide.
> >> >
> >> > But still, what's the difference to the case where we cannot vectorize
> >> > the function?  What happens if we cannot vectorize the elemental?
> >> > Do we have to build scalar versions for all possible vector sizes?
> >> >
> >> > Richard.
> >> >
> >> >> Regards,
> >> >> Sergos
> >> >>
> >> >>
> >> >> On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener <rguenther@suse.de> wrote:
> >> >> > Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
> >> >> >>this is only for the whole file? I mean to have a particular loop
> >> >> >>vectorized in a
> >> >> >>file while all others - up to compiler's cost model. is there such a
> >> >> >>machinery?
> >> >> >
> >> >> > No, there is not.
> >> >> >
> >> >> > Richard.
> >> >> >
> >> >> >>Sergos
> >> >> >>
> >> >> >>On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener <rguenther@suse.de>
> >> >> >>wrote:
> >> >> >>> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
> >> >> >>>
> >> >> >>>> I will get some tests.
> >> >> >>>> As for cost analysis - simply consider the pragma as a request to
> >> >> >>>> vectorize. How can I - as a developer - enforce it beyond the
> >> >> >>pragma?
> >> >> >>>
> >> >> >>> You can disable the cost model via -fvect-cost-model=unlimited
> >> >> >>>
> >> >> >>> Richard.
> >> >> >>>
> >> >> >>>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguenther@suse.de>
> >> >> >>wrote:
> >> >> >>>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
> >> >> >>>> >
> >> >> >>>> >> The reason patch was in its original state is because we want
> >> >> >>>> >> to notify user that his assumption of profitability may be wrong.
> >> >> >>>> >> This is not a part of any spec and as far as I know ICC does not
> >> >> >>>> >> notify user about the case. Still it can be a good hint for those
> >> >> >>>> >> users who tries to get as much as possible performance.
> >> >> >>>> >>
> >> >> >>>> >> Richard's comment on the vectorization problems is about the same
> >> >> >>-
> >> >> >>>> >> to inform user that his attempt to force vectorization is failed.
> >> >> >>>> >>
> >> >> >>>> >> As for profitable or not - sometimes I believe it's impossible to
> >> >> >>be
> >> >> >>>> >> precise. For OMP we have case of a vector version of a function
> >> >> >>>> >> and we have no chance to figure out whether it is profitable to
> >> >> >>use
> >> >> >>>> >> it or to loose it. If we can't map the loop for any vector length
> >> >> >>>> >> other than 1 - I believe in this case we have to bail out and
> >> >> >>report.
> >> >> >>>> >> Is it about 'never profitable'?
> >> >> >>>> >
> >> >> >>>> > For example.  I think we should report non-vectorized loops
> >> >> >>>> > that are marked with force_vect anyway, with
> >> >> >>-Wdisabled-optimization.
> >> >> >>>> > Another case is that a loop may be profitable to vectorize if
> >> >> >>>> > the ISA supports a gather instruction but otherwise not.  Or if
> >> >> >>the
> >> >> >>>> > ISA supports efficient vector construction from N not loop
> >> >> >>>> > invariant scalars (for vectorization of strided loads).
> >> >> >>>> >
> >> >> >>>> > Simply disregarding all of the cost analysis sounds completely
> >> >> >>>> > bogus to me.
> >> >> >>>> >
> >> >> >>>> > I'd simply go for the diagnostic for now, not changing anything
> >> >> >>else.
> >> >> >>>> > We want to have a good understanding about why the cost model is
> >> >> >>>> > so bad that we have to force to ignore it for #pragma simd - thus
> >> >> >>we
> >> >> >>>> > want testcases.
> >> >> >>>> >
> >> >> >>>> > Richard.
> >> >> >>>> >
> >> >> >>>> >>
> >> >> >>>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener
> >> >> >><rguenther@suse.de> wrote:
> >> >> >>>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
> >> >> >>>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich
> >> >> >>wrote:
> >> >> >>>> >> >>> ivdep just substitutes all cross-iteration data analysis,
> >> >> >>>> >> >>> nothing related to cost model. ICC does not cancel its
> >> >> >>>> >> >>> cost model in case of #pragma ivdep
> >> >> >>>> >> >>>
> >> >> >>>> >> >>> as for the safelen - OMP standart treats it as a limitation
> >> >> >>>> >> >>> for the vector length. this means if no safelen is present
> >> >> >>>> >> >>> an arbitrary vector length can be used.
> >> >> >>>> >> >>
> >> >> >>>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for
> >> >> >>#pragma omp simd
> >> >> >>>> >> >> without safelen clause or #pragma simd without vectorlength
> >> >> >>clause.
> >> >> >>>> >> >>
> >> >> >>>> >> >>> so I believe loop->force_vect is the only trigger to
> >> >> >>disregard
> >> >> >>>> >> >>> the cost model
> >> >> >>>> >> >>
> >> >> >>>> >> >> Anyway, in that case I think the originally posted patch is
> >> >> >>wrong,
> >> >> >>>> >> >> if we want to treat force_vect as disregard all the cost model
> >> >> >>and
> >> >> >>>> >> >> force vectorization (well, the name of the field already kind
> >> >> >>of suggest
> >> >> >>>> >> >> that), then IMHO we should treat it the same as
> >> >> >>-fvect-cost-model=unlimited
> >> >> >>>> >> >> for those loops.
> >> >> >>>> >> >
> >> >> >>>> >> > Err - the user may have a specific sub-architecture in mind
> >> >> >>when using
> >> >> >>>> >> > #pragma simd, if you say we should completely ignore the cost
> >> >> >>model
> >> >> >>>> >> > then should we also sorry () if we cannot vectorize the loop
> >> >> >>(either
> >> >> >>>> >> > because of GCC deficiencies or lack of sub-target support)?
> >> >> >>>> >> >
> >> >> >>>> >> > That said, at least in the cases that the cost model says the
> >> >> >>loop
> >> >> >>>> >> > is never profitable to vectorize we should follow its advice.
> >> >> >>>> >> >
> >> >> >>>> >> > Richard.
> >> >> >>>> >> >
> >> >> >>>> >> >> Thus (untested):
> >> >> >>>> >> >>
> >> >> >>>> >> >> 2013-11-12  Jakub Jelinek  <jakub@redhat.com>
> >> >> >>>> >> >>
> >> >> >>>> >> >>       * tree-vect-loop.c (vect_estimate_min_profitable_iters):
> >> >> >>Use
> >> >> >>>> >> >>       unlimited cost model also for force_vect loops.
> >> >> >>>> >> >>
> >> >> >>>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.000000000
> >> >> >>+0100
> >> >> >>>> >> >> +++ gcc/tree-vect-loop.c      2013-11-12 15:11:43.821404330
> >> >> >>+0100
> >> >> >>>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
> >> >> >>>> >> >>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA
> >> >> >>(loop_vinfo);
> >> >> >>>> >> >>
> >> >> >>>> >> >>    /* Cost model disabled.  */
> >> >> >>>> >> >> -  if (unlimited_cost_model ())
> >> >> >>>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP
> >> >> >>(loop_vinfo)->force_vect)
> >> >> >>>> >> >>      {
> >> >> >>>> >> >>        dump_printf_loc (MSG_NOTE, vect_location, "cost model
> >> >> >>disabled.\n");
> >> >> >>>> >> >>        *ret_min_profitable_niters = 0;
> >> >> >>>> >> >>
> >> >> >>>> >> >>       Jakub
> >> >> >>>> >> >>
> >> >> >>>> >> >
> >> >> >>>> >>
> >> >> >>>> >>
> >> >> >>>> >
> >> >> >>>> > --
> >> >> >>>> > Richard Biener <rguenther@suse.de>
> >> >> >>>> > SUSE / SUSE Labs
> >> >> >>>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> >> >> >>>> > GF: Jeff Hawn, Jennifer Guild, Felix Imend
> >> >> >>>>
> >> >> >>>>
> >> >> >>>
> >> >> >>> --
> >> >> >>> Richard Biener <rguenther@suse.de>
> >> >> >>> SUSE / SUSE Labs
> >> >> >>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> >> >> >>> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
> >> >> >
> >> >> >
> >> >>
> >> >
> >> > --
> >> > Richard Biener <rguenther@suse.de>
> >> > SUSE / SUSE Labs
> >> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> >> > GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
> >>
> >>
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE / SUSE Labs
> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> > GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 14:57                                         ` Richard Biener
@ 2013-11-19 14:58                                           ` Jakub Jelinek
  2013-11-19 15:07                                             ` Sergey Ostanevich
  2013-11-19 15:02                                           ` Sergey Ostanevich
  1 sibling, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-19 14:58 UTC (permalink / raw)
  To: Richard Biener
  Cc: Sergey Ostanevich, Richard Henderson, Yuri Rumyantsev,
	gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Tue, Nov 19, 2013 at 03:07:52PM +0100, Richard Biener wrote:
> On Tue, 19 Nov 2013, Sergey Ostanevich wrote:
> 
> > :) agree to you, but as soon as you're a user who tries to introduce
> > vector code and face a bug in cost model you'd like to have a
> > workaround until the bug will be fixed and compiler will come to you
> > with new OS distribution, don't you?
> > 
> > I propose the following, yet SLP have to use a NULL as a loop info
> > which looks somewhat hacky.
> 
> I think this is overengineering.  -fvect-cost-model will do as
> workaround.  And -fsimd-vect-cost-model has what I consider
> duplicate - "simd" and "vect".

I think it is a good idea, though I agree about s/simd-vect/simd/ and
I'd use VECT_COST_MODEL_DEFAULT as the default, which would mean
just use -fvect-cost-model.
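
To make that default concrete, here is a minimal standalone sketch of the
fallback, assuming a separate simd cost-model setting exists next to the
global one. All names in it (cost_model, loop_model,
unlimited_cost_model_sketch) are hypothetical and only model the idea; it is
not the vectorizer code and does not use GCC internals.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the option values; not GCC's own enum.  */
enum cost_model
{
  COST_MODEL_DEFAULT,	/* "Not set": defer to the global option.  */
  COST_MODEL_UNLIMITED,
  COST_MODEL_DYNAMIC,
  COST_MODEL_CHEAP
};

/* Hypothetical stand-in for the loop structure; only the flag matters.  */
struct loop_model
{
  bool force_vect;	/* Loop was marked with #pragma omp simd.  */
};

/* Return true if the cost model should be treated as unlimited for LOOP.
   A simd-marked loop uses SIMD_FLAG unless it is left at DEFAULT, in which
   case (and for every unmarked loop or a NULL loop, as in SLP) the global
   VECT_FLAG decides.  */
bool
unlimited_cost_model_sketch (const struct loop_model *loop,
			     enum cost_model vect_flag,
			     enum cost_model simd_flag)
{
  if (loop != NULL && loop->force_vect && simd_flag != COST_MODEL_DEFAULT)
    return simd_flag == COST_MODEL_UNLIMITED;
  return vect_flag == COST_MODEL_UNLIMITED;
}
```

With the simd setting left at its default, marked and unmarked loops behave
identically, so nothing changes for existing users; only an explicit simd
setting makes the marked loops diverge from the rest.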

> > @@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters
> > (loop_vec_info loop_vinfo,
> >    /* vector version will never be profitable.  */
> >    else
> >      {
> > +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> > +        {
> > +          pedwarn (vect_location, 0, "Vectorization did not happen
> > for the loop");
> > +        }

pedwarn isn't really desirable for this; you want just a warning, and
one you can actually turn off, e.g. -Wopenmp-simd (we'd also use it when
we ignore #pragma omp declare simd because it wasn't useful/desirable).

	Jakub


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 14:57                                         ` Richard Biener
  2013-11-19 14:58                                           ` Jakub Jelinek
@ 2013-11-19 15:02                                           ` Sergey Ostanevich
  1 sibling, 0 replies; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-19 15:02 UTC (permalink / raw)
  To: Richard Biener
  Cc: Jakub Jelinek, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Tue, Nov 19, 2013 at 6:07 PM, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 19 Nov 2013, Sergey Ostanevich wrote:
>
>> :) agree to you, but as soon as you're a user who tries to introduce
>> vector code and face a bug in cost model you'd like to have a
>> workaround until the bug will be fixed and compiler will come to you
>> with new OS distribution, don't you?
>>
>> I propose the following, yet SLP have to use a NULL as a loop info
>> which looks somewhat hacky.
>
> I think this is overengineering.  -fvect-cost-model will do as
> workaround.  And -fsimd-vect-cost-model has what I consider
> duplicate - "simd" and "vect".

I just wanted to separate the auto-vectorized loops from the ones the
user wants to vectorize. -fvect-cost-model will force all of them at
once. That's the reason to introduce simd-vect, since the pragma name
is simd.
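
A small illustration of the distinction, as a sketch only: the two functions
below are hypothetical (they are not the testcase discussed in this thread).
The first loop is one the user explicitly asks to vectorize; the second is
left entirely to the auto-vectorizer. A global cost-model override affects
both, while a simd-specific setting would affect only the first.

```c
#include <stddef.h>

/* User-requested vectorization: a strided reduction marked with the pragma.
   A simd-specific cost-model setting would apply to this loop only.
   Without OpenMP simd support enabled the pragma is simply ignored.  */
float
sum_strided_simd (const float *a, size_t n, size_t stride)
{
  float s = 0.0f;
#pragma omp simd reduction(+:s)
  for (size_t i = 0; i < n; i++)
    s += a[i * stride];
  return s;
}

/* Unmarked loop: whether it is vectorized stays up to the compiler's
   regular cost model, unless the global override is used.  */
float
sum_contiguous (const float *a, size_t n)
{
  float s = 0.0f;
  for (size_t i = 0; i < n; i++)
    s += a[i];
  return s;
}
```

Both functions compute the same result whether or not the loops end up
vectorized; the options discussed here only influence which code is
generated for them.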

>
> Richard.
>
>> Sergos
>>
>>
>>         * common.opt: Added new option -fsimd-vect-cost-model
>>         * tree-vectorizer.h (unlimited_cost_model): Interface update
>>         to rely on particular loop info
>>         * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
>>         unlimited_cost_model call according to new interface
>>         (vect_peeling_hash_choose_best_peeling): Ditto
>>         (vect_enhance_data_refs_alignment): Ditto
>>         * tree-vect-slp.c: Ditto
>>         * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
>>         plus issue a warning in case cost model overrides users' directive
>>
>>
>>
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index d5971df..87b3b37 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -2296,6 +2296,10 @@ fvect-cost-model=
>>  Common Joined RejectNegative Enum(vect_cost_model)
>> Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
>>  Specifies the cost model for vectorization
>>
>> +fsimd-vect-cost-model=
>> +Common Joined RejectNegative Enum(vect_cost_model)
>> Var(flag_simd_vect_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
>> +Specifies the cost model for vectorization in loops marked with
>> #pragma omp simd
>> +
>>  Enum
>>  Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
>> vectorizer cost model %qs)
>>
>> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
>> index 83d1f45..e26f704 100644
>> --- a/gcc/tree-vect-data-refs.c
>> +++ b/gcc/tree-vect-data-refs.c
>> @@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
>> loop_vinfo, struct data_reference *dr,
>>        *new_slot = slot;
>>      }
>>
>> -  if (!supportable_dr_alignment && unlimited_cost_model ())
>> +  if (!supportable_dr_alignment
>> +      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>>      slot->count += VECT_MAX_COST;
>>  }
>>
>> @@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
>> (loop_vec_info loop_vinfo,
>>     res.peel_info.dr = NULL;
>>     res.body_cost_vec = stmt_vector_for_cost ();
>>
>> -   if (!unlimited_cost_model ())
>> +   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>>       {
>>         res.inside_cost = INT_MAX;
>>         res.outside_cost = INT_MAX;
>> @@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
>> loop_vinfo)
>>                   vectorization factor.
>>                   We do this automtically for cost model, since we
>> calculate cost
>>                   for every peeling option.  */
>> -              if (unlimited_cost_model ())
>> +              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>>                  possible_npeel_number = vf /nelements;
>>
>>                /* Handle the aligned case. We may decide to align some other
>> @@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
>> loop_vinfo)
>>                if (DR_MISALIGNMENT (dr) == 0)
>>                  {
>>                    npeel_tmp = 0;
>> -                  if (unlimited_cost_model ())
>> +                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>>                      possible_npeel_number++;
>>                  }
>>
>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>> index 86ebbd2..be66172 100644
>> --- a/gcc/tree-vect-loop.c
>> +++ b/gcc/tree-vect-loop.c
>> @@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters
>> (loop_vec_info loop_vinfo,
>>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
>>
>>    /* Cost model disabled.  */
>> -  if (unlimited_cost_model ())
>> +  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>>      {
>>        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
>>        *ret_min_profitable_niters = 0;
>> @@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters
>> (loop_vec_info loop_vinfo,
>>    /* vector version will never be profitable.  */
>>    else
>>      {
>> +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
>> +        {
>> +          pedwarn (vect_location, 0, "Vectorization did not happen
>> for the loop");
>> +        }
>> +
>>        if (dump_enabled_p ())
>>          dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>>   "cost model: the vector iteration cost = %d "
>> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
>> index 247bdfd..4b25964 100644
>> --- a/gcc/tree-vect-slp.c
>> +++ b/gcc/tree-vect-slp.c
>> @@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
>>      }
>>
>>    /* Cost model: check if the vectorization is worthwhile.  */
>> -  if (!unlimited_cost_model ()
>> +  if (!unlimited_cost_model (NULL)
>>        && !vect_bb_vectorization_profitable_p (bb_vinfo))
>>      {
>>        if (dump_enabled_p ())
>> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
>> index a6c5b59..2916906 100644
>> --- a/gcc/tree-vectorizer.h
>> +++ b/gcc/tree-vectorizer.h
>> @@ -919,9 +919,12 @@ known_alignment_for_access_p (struct
>> data_reference *data_ref_info)
>>
>>  /* Return true if the vect cost model is unlimited.  */
>>  static inline bool
>> -unlimited_cost_model ()
>> +unlimited_cost_model (loop_p loop)
>>  {
>> -  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
>> +  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
>> +          || (loop != NULL
>> +              && loop->force_vect
>> +              && flag_simd_vect_cost_model == VECT_COST_MODEL_UNLIMITED));
>>  }
>>
>>  /* Source location */
>>
>> On Mon, Nov 18, 2013 at 7:13 PM, Richard Biener <rguenther@suse.de> wrote:
>> > On Mon, 18 Nov 2013, Sergey Ostanevich wrote:
>> >
>> >> I would agree that the example is just for the case cost model makes
>> >> correct estimation But how can we assure ourself that it won't have any
>> >> mistakes in the future?
>> >
>> > We call it bugs and not mistakes and we have bugzilla for it.
>> >
>> > Richard.
>> >
>> >> I believe it'll be Ok to introduce an extra flag as Jakub proposed for the
>> >> dedicated simd-forced vectorization to use unlimited cost model. This
>> >> can be default for -fopenmp or there should be a warning issued that
>> >> compiler overrides user's request of vectorization. In such a case user
>> >> can enforce vectorization (even with mentioned results :) with this
>> >> unlimited cost model for simd.
>> >>
>> >>
>> >>
>> >> On Fri, Nov 15, 2013 at 6:24 PM, Richard Biener <rguenther@suse.de> wrote:
>> >> > On Fri, 15 Nov 2013, Sergey Ostanevich wrote:
>> >> >
>> >> >> Richard,
>> >> >>
>> >> >> here's an example that causes trigger for the cost model.
>> >> >
>> >> > I hardly believe that (AVX2)
>> >> >
>> >> > .L9:
>> >> >         vmovups (%rsi), %xmm3
>> >> >         addl    $1, %r8d
>> >> >         addq    $256, %rsi
>> >> >         vinsertf128     $0x1, -240(%rsi), %ymm3, %ymm1
>> >> >         vmovups -224(%rsi), %xmm3
>> >> >         vinsertf128     $0x1, -208(%rsi), %ymm3, %ymm3
>> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm2
>> >> >         vshufps $68, %ymm2, %ymm3, %ymm1
>> >> >         vshufps $238, %ymm2, %ymm3, %ymm2
>> >> >         vmovups -192(%rsi), %xmm3
>> >> >         vinsertf128     $1, %xmm2, %ymm1, %ymm2
>> >> >         vinsertf128     $0x1, -176(%rsi), %ymm3, %ymm1
>> >> >         vmovups -160(%rsi), %xmm3
>> >> >         vinsertf128     $0x1, -144(%rsi), %ymm3, %ymm3
>> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm1
>> >> >         vshufps $68, %ymm1, %ymm3, %ymm4
>> >> >         vshufps $238, %ymm1, %ymm3, %ymm1
>> >> >         vmovups -128(%rsi), %xmm3
>> >> >         vinsertf128     $1, %xmm1, %ymm4, %ymm1
>> >> >         vshufps $136, %ymm1, %ymm2, %ymm1
>> >> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
>> >> >         vshufps $68, %ymm2, %ymm1, %ymm4
>> >> >         vshufps $238, %ymm2, %ymm1, %ymm2
>> >> >         vinsertf128     $0x1, -112(%rsi), %ymm3, %ymm1
>> >> >         vmovups -96(%rsi), %xmm3
>> >> >         vinsertf128     $1, %xmm2, %ymm4, %ymm4
>> >> >         vinsertf128     $0x1, -80(%rsi), %ymm3, %ymm3
>> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm2
>> >> >         vshufps $68, %ymm2, %ymm3, %ymm1
>> >> >         vshufps $238, %ymm2, %ymm3, %ymm2
>> >> >         vmovups -64(%rsi), %xmm3
>> >> >         vinsertf128     $1, %xmm2, %ymm1, %ymm2
>> >> >         vinsertf128     $0x1, -48(%rsi), %ymm3, %ymm1
>> >> >         vmovups -32(%rsi), %xmm3
>> >> >         vinsertf128     $0x1, -16(%rsi), %ymm3, %ymm3
>> >> >         cmpl    %r8d, %edi
>> >> >         vshufps $136, %ymm3, %ymm1, %ymm3
>> >> >         vperm2f128      $3, %ymm3, %ymm3, %ymm1
>> >> >         vshufps $68, %ymm1, %ymm3, %ymm5
>> >> >         vshufps $238, %ymm1, %ymm3, %ymm1
>> >> >         vinsertf128     $1, %xmm1, %ymm5, %ymm1
>> >> >         vshufps $136, %ymm1, %ymm2, %ymm1
>> >> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
>> >> >         vshufps $68, %ymm2, %ymm1, %ymm3
>> >> >         vshufps $238, %ymm2, %ymm1, %ymm2
>> >> >         vinsertf128     $1, %xmm2, %ymm3, %ymm1
>> >> >         vshufps $136, %ymm1, %ymm4, %ymm1
>> >> >         vperm2f128      $3, %ymm1, %ymm1, %ymm2
>> >> >         vshufps $68, %ymm2, %ymm1, %ymm3
>> >> >         vshufps $238, %ymm2, %ymm1, %ymm2
>> >> >         vinsertf128     $1, %xmm2, %ymm3, %ymm2
>> >> >         vaddps  %ymm2, %ymm0, %ymm0
>> >> >         ja      .L9
>> >> >
>> >> > is more efficient than
>> >> >
>> >> > .L3:
>> >> >         vaddss  (%rcx,%rax), %xmm0, %xmm0
>> >> >         addq    $32, %rax
>> >> >         cmpq    %rdx, %rax
>> >> >         jne     .L3
>> >> >
>> >> > ;)
>> >> >
>> >> >> As soon as
>> >> >> elemental functions will appear and we update the vectorizer so it can accept
>> >> >> an elemental function inside the loop - we will have the same
>> >> >> situation as we have
>> >> >> it now: cost model will bail out with profitability estimation.
>> >> >
>> >> > Yes.
>> >> >
>> >> >> Still we have no chance to get info on how efficient the bar() function when it
>> >> >> is in vector form.
>> >> >
>> >> > Well I assume you mean that the speedup when vectorizing the elemental
>> >> > will offset whatever wreckage we cause with vectorizing the rest of the
>> >> > statements.  I'd say you can at least compare to unrolling by
>> >> > the vectorization factor, building the vector inputs to the elemental
>> >> > from scalars, distributing the vector result from the elemental to
>> >> > scalars.
>> >> >
>> >> >> I believe I should repeat: #pragma omp simd is intended for introduction of an
>> >> >> instruction-level parallel region on developer's request, hence should
>> >> >> be treated
>> >> >> in same manner as #pragma omp parallel. Vectorizer cost model is an obstacle
>> >> >> here, not a help.
>> >> >
>> >> > Surely not if there isn't an elemental call in it.  With it the
>> >> > cost model of course will have not enough information to decide.
>> >> >
>> >> > But still, what's the difference to the case where we cannot vectorize
>> >> > the function?  What happens if we cannot vectorize the elemental?
>> >> > Do we have to build scalar versions for all possible vector sizes?
>> >> >
>> >> > Richard.
>> >> >
>> >> >> Regards,
>> >> >> Sergos
>> >> >>
>> >> >>
>> >> >> On Fri, Nov 15, 2013 at 1:08 AM, Richard Biener <rguenther@suse.de> wrote:
>> >> >> > Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
>> >> >> >>this is only for the whole file? I mean to have a particular loop
>> >> >> >>vectorized in a
>> >> >> >>file while all others - up to compiler's cost model. is there such a
>> >> >> >>machinery?
>> >> >> >
>> >> >> > No, there is not.
>> >> >> >
>> >> >> > Richard.
>> >> >> >
>> >> >> >>Sergos
>> >> >> >>
>> >> >> >>On Thu, Nov 14, 2013 at 12:39 PM, Richard Biener <rguenther@suse.de>
>> >> >> >>wrote:
>> >> >> >>> On Wed, 13 Nov 2013, Sergey Ostanevich wrote:
>> >> >> >>>
>> >> >> >>>> I will get some tests.
>> >> >> >>>> As for cost analysis - simply consider the pragma as a request to
>> >> >> >>>> vectorize. How can I - as a developer - enforce it beyond the
>> >> >> >>pragma?
>> >> >> >>>
>> >> >> >>> You can disable the cost model via -fvect-cost-model=unlimited
>> >> >> >>>
>> >> >> >>> Richard.
>> >> >> >>>
>> >> >> >>>> On Wed, Nov 13, 2013 at 12:55 PM, Richard Biener <rguenther@suse.de>
>> >> >> >>wrote:
>> >> >> >>>> > On Tue, 12 Nov 2013, Sergey Ostanevich wrote:
>> >> >> >>>> >
>> >> >> >>>> >> The reason patch was in its original state is because we want
>> >> >> >>>> >> to notify user that his assumption of profitability may be wrong.
>> >> >> >>>> >> This is not a part of any spec and as far as I know ICC does not
>> >> >> >>>> >> notify user about the case. Still it can be a good hint for those
>> >> >> >>>> >> users who tries to get as much as possible performance.
>> >> >> >>>> >>
>> >> >> >>>> >> Richard's comment on the vectorization problems is about the same
>> >> >> >>-
>> >> >> >>>> >> to inform user that his attempt to force vectorization is failed.
>> >> >> >>>> >>
>> >> >> >>>> >> As for profitable or not - sometimes I believe it's impossible to
>> >> >> >>be
>> >> >> >>>> >> precise. For OMP we have case of a vector version of a function
>> >> >> >>>> >> and we have no chance to figure out whether it is profitable to
>> >> >> >>use
>> >> >> >>>> >> it or to loose it. If we can't map the loop for any vector length
>> >> >> >>>> >> other than 1 - I believe in this case we have to bail out and
>> >> >> >>report.
>> >> >> >>>> >> Is it about 'never profitable'?
>> >> >> >>>> >
>> >> >> >>>> > For example.  I think we should report non-vectorized loops
>> >> >> >>>> > that are marked with force_vect anyway, with
>> >> >> >>-Wdisabled-optimization.
>> >> >> >>>> > Another case is that a loop may be profitable to vectorize if
>> >> >> >>>> > the ISA supports a gather instruction but otherwise not.  Or if
>> >> >> >>the
>> >> >> >>>> > ISA supports efficient vector construction from N not loop
>> >> >> >>>> > invariant scalars (for vectorization of strided loads).
>> >> >> >>>> >
>> >> >> >>>> > Simply disregarding all of the cost analysis sounds completely
>> >> >> >>>> > bogus to me.
>> >> >> >>>> >
>> >> >> >>>> > I'd simply go for the diagnostic for now, not changing anything
>> >> >> >>else.
>> >> >> >>>> > We want to have a good understanding about why the cost model is
>> >> >> >>>> > so bad that we have to force to ignore it for #pragma simd - thus
>> >> >> >>we
>> >> >> >>>> > want testcases.
>> >> >> >>>> >
>> >> >> >>>> > Richard.
>> >> >> >>>> >
>> >> >> >>>> >>
>> >> >> >>>> >> On Tue, Nov 12, 2013 at 6:35 PM, Richard Biener
>> >> >> >><rguenther@suse.de> wrote:
>> >> >> >>>> >> > On 11/12/13 3:16 PM, Jakub Jelinek wrote:
>> >> >> >>>> >> >> On Tue, Nov 12, 2013 at 05:46:14PM +0400, Sergey Ostanevich
>> >> >> >>wrote:
>> >> >> >>>> >> >>> ivdep just substitutes all cross-iteration data analysis,
>> >> >> >>>> >> >>> nothing related to cost model. ICC does not cancel its
>> >> >> >>>> >> >>> cost model in case of #pragma ivdep
>> >> >> >>>> >> >>>
>> >> >> >>>> >> >>> as for the safelen - OMP standart treats it as a limitation
>> >> >> >>>> >> >>> for the vector length. this means if no safelen is present
>> >> >> >>>> >> >>> an arbitrary vector length can be used.
>> >> >> >>>> >> >>
>> >> >> >>>> >> >> I was talking about GCC loop->safelen, which is INT_MAX for
>> >> >> >>#pragma omp simd
>> >> >> >>>> >> >> without safelen clause or #pragma simd without vectorlength
>> >> >> >>clause.
>> >> >> >>>> >> >>
>> >> >> >>>> >> >>> so I believe loop->force_vect is the only trigger to
>> >> >> >>disregard
>> >> >> >>>> >> >>> the cost model
>> >> >> >>>> >> >>
>> >> >> >>>> >> >> Anyway, in that case I think the originally posted patch is
>> >> >> >>wrong,
>> >> >> >>>> >> >> if we want to treat force_vect as disregard all the cost model
>> >> >> >>and
>> >> >> >>>> >> >> force vectorization (well, the name of the field already kind
>> >> >> >>of suggest
>> >> >> >>>> >> >> that), then IMHO we should treat it the same as
>> >> >> >>-fvect-cost-model=unlimited
>> >> >> >>>> >> >> for those loops.
>> >> >> >>>> >> >
>> >> >> >>>> >> > Err - the user may have a specific sub-architecture in mind
>> >> >> >>when using
>> >> >> >>>> >> > #pragma simd, if you say we should completely ignore the cost
>> >> >> >>model
>> >> >> >>>> >> > then should we also sorry () if we cannot vectorize the loop
>> >> >> >>(either
>> >> >> >>>> >> > because of GCC deficiencies or lack of sub-target support)?
>> >> >> >>>> >> >
>> >> >> >>>> >> > That said, at least in the cases that the cost model says the
>> >> >> >>loop
>> >> >> >>>> >> > is never profitable to vectorize we should follow its advice.
>> >> >> >>>> >> >
>> >> >> >>>> >> > Richard.
>> >> >> >>>> >> >
>> >> >> >>>> >> >> Thus (untested):
>> >> >> >>>> >> >>
>> >> >> >>>> >> >> 2013-11-12  Jakub Jelinek  <jakub@redhat.com>
>> >> >> >>>> >> >>
>> >> >> >>>> >> >>       * tree-vect-loop.c (vect_estimate_min_profitable_iters):
>> >> >> >>Use
>> >> >> >>>> >> >>       unlimited cost model also for force_vect loops.
>> >> >> >>>> >> >>
>> >> >> >>>> >> >> --- gcc/tree-vect-loop.c.jj   2013-11-12 12:09:40.000000000
>> >> >> >>+0100
>> >> >> >>>> >> >> +++ gcc/tree-vect-loop.c      2013-11-12 15:11:43.821404330
>> >> >> >>+0100
>> >> >> >>>> >> >> @@ -2702,7 +2702,7 @@ vect_estimate_min_profitable_iters (loop
>> >> >> >>>> >> >>    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA
>> >> >> >>(loop_vinfo);
>> >> >> >>>> >> >>
>> >> >> >>>> >> >>    /* Cost model disabled.  */
>> >> >> >>>> >> >> -  if (unlimited_cost_model ())
>> >> >> >>>> >> >> +  if (unlimited_cost_model () || LOOP_VINFO_LOOP
>> >> >> >>(loop_vinfo)->force_vect)
>> >> >> >>>> >> >>      {
>> >> >> >>>> >> >>        dump_printf_loc (MSG_NOTE, vect_location, "cost model
>> >> >> >>disabled.\n");
>> >> >> >>>> >> >>        *ret_min_profitable_niters = 0;
>> >> >> >>>> >> >>
>> >> >> >>>> >> >>       Jakub
>> >> >> >>>> >> >>
>> >> >> >>>> >> >
>> >> >> >>>> >>
>> >> >> >>>> >>
>> >> >> >>>> >
>> >> >> >>>> > --
>> >> >> >>>> > Richard Biener <rguenther@suse.de>
>> >> >> >>>> > SUSE / SUSE Labs
>> >> >> >>>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> >> >> >>>> > GF: Jeff Hawn, Jennifer Guild, Felix Imend
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>
>> >> >> >>> --
>> >> >> >>> Richard Biener <rguenther@suse.de>
>> >> >> >>> SUSE / SUSE Labs
>> >> >> >>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> >> >> >>> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
>> >> >> >
>> >> >> >
>> >> >>
>> >> >
>> >> > --
>> >> > Richard Biener <rguenther@suse.de>
>> >> > SUSE / SUSE Labs
>> >> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> >> > GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
>> >>
>> >>
>> >
>> > --
>> > Richard Biener <rguenther@suse.de>
>> > SUSE / SUSE Labs
>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> > GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
>>
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 14:58                                           ` Jakub Jelinek
@ 2013-11-19 15:07                                             ` Sergey Ostanevich
  2013-11-19 15:08                                               ` Jakub Jelinek
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-19 15:07 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

>> > I propose the following, yet SLP have to use a NULL as a loop info
>> > which looks somewhat hacky.
>>
>> I think this is overengineering.  -fvect-cost-model will do as
>> workaround.  And -fsimd-vect-cost-model has what I consider
>> duplicate - "simd" and "vect".
>
> I think it is a good idea, though I agree about s/simd-vect/simd/ and
> I'd use VECT_COST_MODEL_DEFAULT as the default, which would mean
> just use -fvect-cost-model.

that's ok, since we'd have a way to force those 'simd' loops.

>
>> > @@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>> >    /* vector version will never be profitable.  */
>> >    else
>> >      {
>> > +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
>> > +        {
>> > +          pedwarn (vect_location, 0, "Vectorization did not happen for the loop");
>> > +        }
>
> pedwarn isn't really desirable for this, you want just warning,
> but some warning you can actually also turn off.
> -Wopenmp-simd (and we'd use it also when we ignore #pragma omp declare simd
> because it wasn't useful/desirable).

What if a user wants to enable warning-as-error for this case?
Can we disable the pedwarn the same way?

Sergos

>
>         Jakub


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 15:07                                             ` Sergey Ostanevich
@ 2013-11-19 15:08                                               ` Jakub Jelinek
  2013-11-19 21:08                                                 ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-19 15:08 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Richard Biener, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

On Tue, Nov 19, 2013 at 06:39:48PM +0400, Sergey Ostanevich wrote:
> > pedwarn isn't really desirable for this, you want just warning,
> > but some warning you can actually also turn off.
> > -Wopenmp-simd (and we'd use it also when we ignore #pragma omp declare simd
> > because it wasn't useful/desirable).
> 
> consider a user is interested in enabling warning-as-error for this case?

-Werror=openmp-simd will work then; this works for any named warning.

> can we disable the pedwarn the same way?

pedwarn is for pedantic warnings.  No standard says that #pragma omp simd
must be vectorized, or that #pragma omp simd or #pragma omp declare simd
is anything but an optimization hint, so pedwarn isn't what you are looking
for.

	Jakub
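
A minimal sketch of the kind of loop under discussion, for readers who
want to try the options named above. The function name saxpy and the
compile line in the comment are assumptions made for illustration;
-Wopenmp-simd is the warning proposed in this thread, and -fopenmp-simd
is assumed to be supported by the compiler in use.

```c
#include <stddef.h>

/* Assumed compile line (illustrative only):
     gcc -O3 -fopenmp-simd -Wopenmp-simd -Werror=openmp-simd -c saxpy.c
   Without -fopenmp or -fopenmp-simd the pragma below is ignored.  */

/* Compute y[i] = a * x[i] + y[i] for n elements.  The pragma is only an
   optimization hint: no standard requires the loop to be vectorized.  */
void
saxpy (size_t n, float a, const float *x, float *y)
{
#pragma omp simd
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

With -Werror=openmp-simd, a marked loop that the cost model declines to
vectorize would turn into a hard error, while -Wno-openmp-simd would
silence the diagnostic, since both forms work for any named warning.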


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 15:08                                               ` Jakub Jelinek
@ 2013-11-19 21:08                                                 ` Sergey Ostanevich
  2013-11-19 21:45                                                   ` Tobias Burnus
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-19 21:08 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

OK, got it.
I'm not sure whether C/C++ and Fortran are enough.


        * common.opt: Added new option -fsimd-cost-model
        * tree-vectorizer.h (unlimited_cost_model): Interface update
        to rely on particular loop info
        * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
        unlimited_cost_model call according to new interface
        (vect_peeling_hash_choose_best_peeling): Ditto
        (vect_enhance_data_refs_alignment): Ditto
        * tree-vect-slp.c: Ditto
        * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
        plus issue a warning in case cost model overrides users' directive
        * c-family/c.opt: add openmp-simd warning
        * fortran/lang.opt: Ditto


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..84911a0 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used

+Wopenmp-simd
+C C++ Var(openmp_simd) Warning
+Warn about omp simd construct is overridden by cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..9fab3ae 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,6 +2296,10 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_cost_model) Init(VECT_COST_MODEL_DEFAULT)
+Specifies the cost model for vectorization in loops marked with omp simd
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)

diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..3fc98a6 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard

+Wopenmp-simd
+Fortran Warning
+Warn about omp simd construct is overridden by cost model
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling (loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;

               /* Handle the aligned case. We may decide to align some other
@@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 86ebbd2..d360f43 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2929,6 +2929,12 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+        {
+          warning (OPT_Wopenmp_simd, "Vectorization did not happen for the loop ",
+                   "labeled as simd");
+        }
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 247bdfd..4b25964 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }

   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index a6c5b59..fd255db 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -919,9 +919,12 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)

 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
+          || (loop != NULL
+              && loop->force_vect
+              && flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED));
 }

 /* Source location */
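
The predicate change in tree-vectorizer.h can be exercised in
isolation. The sketch below mocks just enough to show its truth table:
struct loop, the two flag variables, and the two-value enum are
stand-ins written for this example, not the real GCC definitions. Only
the body of unlimited_cost_model is taken from the hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mock of the cost-model enum; only the two values named in the patch.  */
enum vect_cost_model
{
  VECT_COST_MODEL_UNLIMITED,
  VECT_COST_MODEL_DEFAULT
};

/* Mock of the loop structure; only the field the predicate reads.  */
struct loop
{
  bool force_vect;	/* Set for loops marked with a simd directive.  */
};
typedef struct loop *loop_p;

/* Mock option variables (-fvect-cost-model= and -fsimd-cost-model=).  */
static enum vect_cost_model flag_vect_cost_model = VECT_COST_MODEL_DEFAULT;
static enum vect_cost_model flag_simd_cost_model = VECT_COST_MODEL_DEFAULT;

/* Return true if the cost model should be ignored for LOOP.  A NULL
   LOOP (the basic-block SLP caller) consults only the global flag.  */
static bool
unlimited_cost_model (loop_p loop)
{
  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
          || (loop != NULL
              && loop->force_vect
              && flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED));
}
```

Under these assumptions the simd-specific flag can only widen the
unlimited case for marked loops; it never re-enables the cost model
when -fvect-cost-model=unlimited is already in effect.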



* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 21:08                                                 ` Sergey Ostanevich
@ 2013-11-19 21:45                                                   ` Tobias Burnus
  2013-11-20 14:05                                                     ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Tobias Burnus @ 2013-11-19 21:45 UTC (permalink / raw)
  To: Sergey Ostanevich, Jakub Jelinek
  Cc: Richard Biener, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

I have some small comments on the patch:

* You should also update gcc/doc/invoke.texi

Sergey Ostanevich wrote:
> index 0026683..84911a0 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
...
> +Wopenmp-simd
> +C C++ Var(openmp_simd) Warning
> +Warn about omp simd construct is overridden by cost model

> --- a/gcc/fortran/lang.opt
> +++ b/gcc/fortran/lang.opt
...
> +Wopenmp-simd
> +Fortran Warning
> +Warn about omp simd construct is overridden by cost model

As the option files get merged, using

Wopenmp-simd
Fortran
; Documented in C

is sufficient.


> +fsimd-cost-model=
> +Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_cost_model) Init(VECT_COST_MODEL_DEFAULT)
> +Specifies the cost model for vectorization in loops marked with omp simd

I wonder whether we need to care about Cilk Plus' "#pragma simd" in this 
summary.



> +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> +        {
> +          warning (OPT_Wopenmp_simd, "Vectorization did not happen for the loop ",
> +                   "labeled as simd");
> +        }
> +

The warning line is too long.


Tobias


* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-19 21:45                                                   ` Tobias Burnus
@ 2013-11-20 14:05                                                     ` Sergey Ostanevich
  2013-11-20 15:11                                                       ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-20 14:05 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Jakub Jelinek, Richard Biener, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

Thanks for the comments; I hope I got all of them.
Note: I used LOOP_VINFO_LOC (loop_vinfo) to print the loop location,
but it appears to be 0, so the output is somewhat lousy. The global
vect_location points somewhere inside the loop, which is not much better.
Shall we address this separately?

Sergos

        * common.opt: Added new option -fsimd-cost-model.
        * tree-vectorizer.h (unlimited_cost_model): Interface update
        to rely on particular loop info.
        * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
        unlimited_cost_model call according to new interface.
        (vect_peeling_hash_choose_best_peeling): Ditto.
        (vect_enhance_data_refs_alignment): Ditto.
        * tree-vect-slp.c: Ditto.
        * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
        plus issue a warning in case cost model overrides users' directive.
        * c-family/c.opt: add openmp-simd warning.
        * fortran/lang.opt: Ditto.
        * doc/invoke.texi: Added new openmp-simd warning.



diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..a85a8ad 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used

+Wopenmp-simd
+C C++ Var(openmp_simd) Warning EnabledBy(Wall)
+Warn if a simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..2b0e9e6 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,6 +2296,10 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c250385..050bd44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces -Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4804,6 +4805,10 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.

+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides a simd directive from the user.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..b43c48c 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard

+Wopenmp-simd
+Fortran Warning
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling (loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;

               /* Handle the aligned case. We may decide to align some other
@@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 86ebbd2..1fd28e3 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2929,6 +2929,13 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+        {
+          warning_at (LOOP_VINFO_LOC (loop_vinfo), OPT_Wopenmp_simd,
+                      "Vectorization did not happen for "
+                      "the loop labeled as simd.");
+        }
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
  "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 247bdfd..4b25964 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }

   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index a6c5b59..fd255db 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -919,9 +919,12 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)

 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
+          || (loop != NULL
+              && loop->force_vect
+              && flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED));
 }

 /* Source location */
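
A simplified sketch of when the new warning would fire, under stated
assumptions. The enum, the helper name classify_loop, and its parameters
are hypothetical; the profitability test is inferred from the dump
message ("the vector iteration cost ... divided by the scalar iteration
cost ... is greater or equal to the vectorization factor") and leaves
out everything else the real function computes.

```c
#include <stdbool.h>

/* Hypothetical outcome labels, not GCC identifiers.  */
enum sketch_decision
{
  SKETCH_COST_MODEL_SKIPPED,	/* Cost model disabled; vectorize anyway.  */
  SKETCH_MAYBE_PROFITABLE,	/* Go on to compute a minimum trip count.  */
  SKETCH_NEVER_PROFITABLE	/* Vector version can never pay off.  */
};

/* Hypothetical helper mirroring the order of checks in the patch.
   UNLIMITED stands for the result of unlimited_cost_model (loop).
   *WARN is set when the -Wopenmp-simd diagnostic would be issued.  */
static enum sketch_decision
classify_loop (bool unlimited, bool force_vect,
               int scalar_single_iter_cost, int vec_inside_cost, int vf,
               bool *warn)
{
  *warn = false;

  /* Early exit: no cost comparison at all, hence no warning.  */
  if (unlimited)
    return SKETCH_COST_MODEL_SKIPPED;

  /* Assumed test: one vector iteration must cost less than VF scalar
     iterations for vectorization to have any chance of paying off.  */
  if (scalar_single_iter_cost * vf > vec_inside_cost)
    return SKETCH_MAYBE_PROFITABLE;

  /* Never profitable: warn only if the user marked the loop as simd.  */
  *warn = force_vect;
  return SKETCH_NEVER_PROFITABLE;
}
```

One consequence, if this reading is right: with
Init(VECT_COST_MODEL_UNLIMITED) for -fsimd-cost-model as in this
revision, marked loops take the early exit, so the warning is reached
only when the user selects a different simd cost model.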




* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-20 14:05                                                     ` Sergey Ostanevich
@ 2013-11-20 15:11                                                       ` Richard Biener
  2013-11-20 15:44                                                         ` Jakub Jelinek
  0 siblings, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-20 15:11 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Tobias Burnus, Jakub Jelinek, Richard Henderson, Yuri Rumyantsev,
	gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Wed, 20 Nov 2013, Sergey Ostanevich wrote:

> Thanks for comments, hope I got all of em.
> Note: I used a LOOP_VINFO_LOC (loop_vinfo) to print the loop location
> but it appears to be 0, so the output is somewhat lousy. The global
> vect_location points somewhere inside the loop, which is not that better.
> Shall we address this separately?

Use vect_location instead.

Note that c-family/ and fortran/ have their own ChangeLog files,
and file names there don't carry the directory prefix.

Ok with the change to use vect_location.

Thanks,
Richard.


-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-20 15:11                                                       ` Richard Biener
@ 2013-11-20 15:44                                                         ` Jakub Jelinek
  2013-11-20 16:11                                                           ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-20 15:44 UTC (permalink / raw)
  To: Richard Biener
  Cc: Sergey Ostanevich, Tobias Burnus, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Wed, Nov 20, 2013 at 02:59:21PM +0100, Richard Biener wrote:
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -592,6 +592,10 @@ Wold-style-definition
> >  C ObjC Var(warn_old_style_definition) Warning
> >  Warn if an old-style parameter definition is used
> > 
> > +Wopenmp-simd
> > +C C++ Var(openmp_simd) Warning EnabledBy(Wall)

Please use Var(warn_openmp_simd) here.

> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2296,6 +2296,10 @@ fvect-cost-model=
> >  Common Joined RejectNegative Enum(vect_cost_model)
> > Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
> >  Specifies the cost model for vectorization
> > 
> > +fsimd-cost-model=
> > +Common Joined RejectNegative Enum(vect_cost_model)
> > Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
> > +Specifies the vectorization cost model for code marked with simd directive
> > +
> >  Enum
> >  Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
> > vectorizer cost model %qs)

I'd say you want to add
EnumValue
Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
here.

> > @@ -2929,6 +2929,13 @@ vect_estimate_min_profitable_iters
> > (loop_vec_info loop_vinfo,
> >    /* vector version will never be profitable.  */
> >    else
> >      {
> > +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> > +        {
> > +          warning_at (LOOP_VINFO_LOC (loop_vinfo), OPT_Wopenmp_simd,
> > +                      "Vectorization did not happen for "
> > +                      "the loop labeled as simd.");

No {} around single stmt then body.  Also, diagnostic messages
don't start with a capital letter and don't end with dot.
So
		"vectorization did not happen for "
		"a simd loop"
or so.

> >  /* Return true if the vect cost model is unlimited.  */
> >  static inline bool
> > -unlimited_cost_model ()
> > +unlimited_cost_model (loop_p loop)
> >  {
> > -  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
> > +  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
> > +          || (loop != NULL
> > +              && loop->force_vect
> > +              && flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED));
> >  }

IMNSHO this should instead do:
  if (loop != NULL && loop->force_vect
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
so, if user said that -fsimd-cost-model=default, then it should honor
-fvect-cost-model.  And, IMHO that should be the default, but I don't
feel strongly about that.
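
To make the precedence concrete, here is a minimal standalone sketch of the
logic suggested above. The enum, the two flag variables and the bool
parameters are stand-ins invented for this illustration, not the real GCC
declarations; only the decision logic mirrors the snippet.

```c
#include <stdbool.h>

/* Stand-in for GCC's enum vect_cost_model; names are illustrative only.  */
enum cost_model_sketch
{
  MODEL_DEFAULT,
  MODEL_UNLIMITED,
  MODEL_DYNAMIC,
  MODEL_CHEAP
};

/* Stand-ins for flag_vect_cost_model and flag_simd_cost_model.  */
static enum cost_model_sketch vect_model = MODEL_DYNAMIC;
static enum cost_model_sketch simd_model = MODEL_DEFAULT;

/* HAS_LOOP models "loop != NULL", FORCE_VECT models "loop->force_vect".
   A simd-marked loop uses the simd model unless that model is "default",
   in which case the generic vectorizer model decides.  */
static bool
unlimited_cost_model_sketch (bool has_loop, bool force_vect)
{
  if (has_loop && force_vect && simd_model != MODEL_DEFAULT)
    return simd_model == MODEL_UNLIMITED;
  return vect_model == MODEL_UNLIMITED;
}
```

With simd_model left at MODEL_DEFAULT the function simply follows
vect_model, which is the "honor -fvect-cost-model" behaviour described
above.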

	Jakub

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-20 15:44                                                         ` Jakub Jelinek
@ 2013-11-20 16:11                                                           ` Sergey Ostanevich
  2013-11-20 21:27                                                             ` Tobias Burnus
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-20 16:11 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Tobias Burnus, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

Updated as per Richard's and Jakub's feedback, assuming that the default
for -fsimd-cost-model is unlimited.
Richard, were you OK with that?

Sergos

        * common.opt: Added new option -fsimd-cost-model.
        * tree-vectorizer.h (unlimited_cost_model): Interface update
        to rely on particular loop info.
        * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
        unlimited_cost_model call according to new interface.
        (vect_peeling_hash_choose_best_peeling): Ditto.
        (vect_enhance_data_refs_alignment): Ditto.
        * tree-vect-slp.c: Ditto.
        * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
        plus issue a warning if the cost model overrides the user's directive.
        * c.opt: Add openmp-simd warning.
        * lang.opt: Ditto.
        * doc/invoke.texi: Added new openmp-simd warning.



diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..6173013 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used

+Wopenmp-simd
+C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
+Warn about simd directive is overridden by vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning
LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified
by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..6a40a5d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,10 +2296,17 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
vectorizer cost model %qs)

 EnumValue
+Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
+
+EnumValue
 Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)

 EnumValue
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c250385..050bd44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces
-Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++
Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4804,6 +4805,10 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.

+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides a simd directive from the user.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..b43c48c 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard

+Wopenmp-simd
+Fortran Warning
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
(loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we
calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;

               /* Handle the aligned case. We may decide to align some other
@@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 86ebbd2..c11d86d 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters
(loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2929,6 +2929,10 @@ vect_estimate_min_profitable_iters
(loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+        warning_at (vect_location, OPT_Wopenmp_simd, "vectorization "
+                    "did not happen for a simd loop");
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
  "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 247bdfd..4b25964 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }

   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index a6c5b59..56ad92c 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -919,9 +919,12 @@ known_alignment_for_access_p (struct
data_reference *data_ref_info)

 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  if (loop != NULL && loop->force_vect
+      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
+    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
 }

 /* Source location */

On Wed, Nov 20, 2013 at 6:14 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Nov 20, 2013 at 02:59:21PM +0100, Richard Biener wrote:
>> > --- a/gcc/c-family/c.opt
>> > +++ b/gcc/c-family/c.opt
>> > @@ -592,6 +592,10 @@ Wold-style-definition
>> >  C ObjC Var(warn_old_style_definition) Warning
>> >  Warn if an old-style parameter definition is used
>> >
>> > +Wopenmp-simd
>> > +C C++ Var(openmp_simd) Warning EnabledBy(Wall)
>
> Please use Var(warn_openmp_simd) here.
>
>> > --- a/gcc/common.opt
>> > +++ b/gcc/common.opt
>> > @@ -2296,6 +2296,10 @@ fvect-cost-model=
>> >  Common Joined RejectNegative Enum(vect_cost_model)
>> > Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
>> >  Specifies the cost model for vectorization
>> >
>> > +fsimd-cost-model=
>> > +Common Joined RejectNegative Enum(vect_cost_model)
>> > Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
>> > +Specifies the vectorization cost model for code marked with simd directive
>> > +
>> >  Enum
>> >  Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
>> > vectorizer cost model %qs)
>
> I'd say you want to add
> EnumValue
> Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
> here.
>
>> > @@ -2929,6 +2929,13 @@ vect_estimate_min_profitable_iters
>> > (loop_vec_info loop_vinfo,
>> >    /* vector version will never be profitable.  */
>> >    else
>> >      {
>> > +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
>> > +        {
>> > +          warning_at (LOOP_VINFO_LOC (loop_vinfo), OPT_Wopenmp_simd,
>> > +                      "Vectorization did not happen for "
>> > +                      "the loop labeled as simd.");
>
> No {} around single stmt then body.  Also, diagnostic messages
> don't start with a capital letter and don't end with dot.
> So
>                 "vectorization did not happen for "
>                 "a simd loop"
> or so.
>
>> >  /* Return true if the vect cost model is unlimited.  */
>> >  static inline bool
>> > -unlimited_cost_model ()
>> > +unlimited_cost_model (loop_p loop)
>> >  {
>> > -  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
>> > +  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
>> > +          || (loop != NULL
>> > +              && loop->force_vect
>> > +              && flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED));
>> >  }
>
> IMNSHO this should instead do:
>   if (loop != NULL && loop->force_vect
>       && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
>     return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
>   return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
> so, if user said that -fsimd-cost-model=default, then it should honor
> -fvect-cost-model.  And, IMHO that should be the default, but I don't
> feel strongly about that.
>
>         Jakub

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-20 16:11                                                           ` Sergey Ostanevich
@ 2013-11-20 21:27                                                             ` Tobias Burnus
  2013-11-21 17:55                                                               ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Tobias Burnus @ 2013-11-20 21:27 UTC (permalink / raw)
  To: Sergey Ostanevich, Jakub Jelinek
  Cc: Richard Biener, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

First,

Sergey Ostanevich wrote:
> +      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> +        warning_at (vect_location, OPT_Wopenmp_simd, "vectorization "
> +                    "did not happen for a simd loop");
> +

If I understand the patch correctly, the warning is shown in two cases:
a) When the loop could be vectorized but the cost model prevented it
b) When the loop couldn't be vectorized for other reasons (e.g.
not vectorizable because of conditional loop exits, incomplete
vectorization support in the compiler, etc.)

Do I understand the warning correctly? I am asking because the *opt and
*texi wording suggests that only (a) is the case. I cannot test this, as the
patch cannot be applied without heavy editing (removal of additional line
breaks, taking care of tabs converted into spaces).
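
For reference, the kind of source this warning is about looks roughly like
the following. This is only an illustrative sketch: the function name and
loop body are made up, and whether the vectorizer accepts this particular
loop, rejects it, or deems it unprofitable depends on the target and the
options in use.

```c
#include <stddef.h>

/* Hypothetical example: scale-and-add over two arrays.  The pragma asks the
   compiler to vectorize the loop; it only takes effect when OpenMP support
   is enabled (e.g. with -fopenmp) and is ignored otherwise.  */
void
saxpy_sketch (float *restrict y, const float *restrict x, float a, size_t n)
{
#pragma omp simd
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

The result is the same whether or not the loop is vectorized; the pragma
only expresses the user's intent, which is what the proposed warning
reports on.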

Regarding the warning, I think it sounds a bit colloquial and as if the
location information is not available. What do you think of "loop with
simd directive not vectorized" or, more concise but not fully correct,
"simd loop not vectorized"?

Additionally, shouldn't that be guarded by "if (warn_openmp_simd &&"? 
Otherwise the flag status isn't used at all in the whole patch.

> +Wopenmp-simd
> +C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
> +Warn about simd directive is overridden by vectorizer cost model

Wording wise, I'd prefer something like:
"Warn if an simd directive is overridden by the vectorizer cost model"

(Or is it "a simd"? Where are the native speakers when one needs them?)

However, in light of my question above, shouldn't it be "Warn if a loop 
with simd directive is not vectorized"?

> +fsimd-cost-model=
> +Common Joined RejectNegative Enum(vect_cost_model)
> Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
> +Specifies the vectorization cost model for code marked with simd directive

I think an article is lacking before "simd".

> +@item -Wopenmp-simd
> +@opindex Wopenm-simd
> +Warn if vectorizer cost model overrides simd directive from user.

I think that can be expanded a bit. One could also mention OpenMP/Cilk 
Plus explicitly. Maybe like:  "Warn if the vectorizer cost model 
overrides the OpenMP and Cilk Plus simd directives of the user."

Or if my reading above is correct, how about something like: "Warn if a 
loop with OpenMP or Cilk Plus simd directive is not vectorized. If only 
the cost model prevented the vectorization, the 
@option{-fsimd-cost-model} option can be used to force the vectorization."

Which brings me to my next point: -fsimd-cost-model= is not documented.
I think some words would be helpful, especially about the valid
arguments, the default and how it interacts with -fvect-cost-model=.

> --- a/gcc/fortran/lang.opt
> +++ b/gcc/fortran/lang.opt
> +Wopenmp-simd
> +Fortran Warning
> +; Documented in C
("Warning" is also not needed as it is taken from c-family/*opt, but it 
shouldn't harm either.)

Tobias

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-20 21:27                                                             ` Tobias Burnus
@ 2013-11-21 17:55                                                               ` Sergey Ostanevich
  2013-11-25 17:16                                                                 ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-21 17:55 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Jakub Jelinek, Richard Biener, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

[-- Attachment #1: Type: text/plain, Size: 4138 bytes --]

Tobias,


> If I understand the patch correctly, the warning is shown in two cases:
> a) When the loop could be vectorized but the cost model prevented it
> b) When the loop couldn't be vectorized for other reasons (e.g. not
> vectorizable because of conditional loop exits, incomplete vectorization
> support in the compiler, etc.)
>
> Do I understand the warning correctly? I am asking because the *opt and
> *texi wording suggests that only (a) is the case. I cannot test this, as the
> patch cannot be applied without heavy editing (removal of additional line
> breaks, taking care of tabs converted into spaces).

I believe it covers only case (a): the warning sits next to the cost model
report, which only compares the relative scalar and vector costs of an
iteration. Loop exits and vectorization capabilities are checked earlier,
so by this point we already have some vector code.
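
As a rough illustration of the condition behind that cost model report, the
check can be sketched as below. The parameter names follow the variables
printed in the dump message ("the vector iteration cost ... divided by the
scalar iteration cost ... is greater or equal to the vectorization factor");
the function itself is hypothetical and simplified, and it ignores any costs
outside the loop body.

```c
#include <stdbool.h>

/* Sketch only, not the GCC code: vectorization can never pay off when one
   vector iteration costs at least as much as the VF scalar iterations it
   replaces, i.e. vec_inside_cost / scalar_single_iter_cost >= vf.
   The comparison is written with a multiplication to avoid integer
   division; both forms agree for positive costs.  */
static bool
never_profitable_sketch (int vec_inside_cost, int scalar_single_iter_cost,
                         int vf)
{
  return vec_inside_cost >= scalar_single_iter_cost * vf;
}
```

For example, with a scalar iteration cost of 2 and a vectorization factor
of 4, a vector iteration cost of 8 or more lands in the "never profitable"
branch where the warning is issued.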

Will try to attach the patch instead of copy-paste here.

>
> Regarding the warning, I think it sounds a bit colloquial and as if the
> location information is not available. What do you think of "loop with simd
> directive not vectorized" or, more concise but not fully correct, "simd
> loop not vectorized"?

took one of yours.

>
> Additionally, shouldn't that be guarded by "if (warn_openmp_simd &&"?
> Otherwise the flag status isn't used at all in the whole patch.

This seems strange to me, since it works when I pass OPT_Wopenmp_simd
to warning_at (). It:
   shows the warning with -Wopenmp-simd
   doesn't show the warning with -Wall -Wno-openmp-simd
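
The observed behaviour can be summarised with a small toy model. This is
not the GCC diagnostic machinery; the enum and both functions are invented
for illustration, under the assumption that a diagnostic keyed on an option
is emitted only when that option is enabled, and that EnabledBy(Wall) lets
an explicit -Wopenmp-simd or -Wno-openmp-simd take precedence over -Wall.

```c
#include <stdbool.h>

/* Tri-state for a warning flag: not given, -Wfoo, or -Wno-foo.  */
enum flag_state_sketch
{
  FLAG_UNSET,
  FLAG_ON,
  FLAG_OFF
};

/* Toy model of "EnabledBy(Wall)": an explicit setting wins; otherwise
   the warning follows -Wall.  */
static bool
openmp_simd_warning_enabled_sketch (enum flag_state_sketch wopenmp_simd,
                                    bool wall)
{
  if (wopenmp_simd != FLAG_UNSET)
    return wopenmp_simd == FLAG_ON;
  return wall;
}

/* Toy model of a diagnostic keyed on an option: returns true if the
   warning would be printed for a loop with FORCE_VECT set.  The caller
   does not test the flag variable itself.  */
static bool
warning_at_sketch (enum flag_state_sketch wopenmp_simd, bool wall,
                   bool force_vect)
{
  return force_vect
         && openmp_simd_warning_enabled_sketch (wopenmp_simd, wall);
}
```

Under this model, -Wopenmp-simd alone shows the warning and
-Wall -Wno-openmp-simd hides it, matching the two cases listed above.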

>
>> +Wopenmp-simd
>> +C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
>> +Warn about simd directive is overridden by vectorizer cost model
>
>
> Wording wise, I'd prefer something like:
> "Warn if an simd directive is overridden by the vectorizer cost model"
>
> (Or is it "a simd"? Where are the native speakers when one needs them?)

Damn, right! I believe it is 'a', since simd starts with a consonant.

>
> However, in light of my question above, shouldn't it be "Warn if a loop with
> simd directive is not vectorized"?
>
>
>
>> +fsimd-cost-model=
>> +Common Joined RejectNegative Enum(vect_cost_model)
>> Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
>> +Specifies the vectorization cost model for code marked with simd
>> directive
>
>
> I think an article is lacking before "simd".

done.

>
>
>> +@item -Wopenmp-simd
>> +@opindex Wopenm-simd
>> +Warn if vectorizer cost model overrides simd directive from user.
>
>
> I think that can be expanded a bit. One could also mention OpenMP/Cilk Plus
> explicitly. Maybe like:  "Warn if the vectorizer cost model overrides the
> OpenMP and Cilk Plus simd directives of the user."
>

done.

> Or if my reading above is correct, how about something like: "Warn if a loop
> with OpenMP or Cilk Plus simd directive is not vectorized. If only the cost
> model prevented the vectorization, the @option{-fsimd-cost-model} option can
> be used to force the vectorization."
>
> Which brings me to my next point: -fsimd-cost-model= is not documented. I
> think some words would be helpful, especially about the valid arguments, the
> default and how it interacts with -fvect-cost-model=.

done.

>
>
>> --- a/gcc/fortran/lang.opt
>> +++ b/gcc/fortran/lang.opt
>>
>> +Wopenmp-simd
>> +Fortran Warning
>> +; Documented in C
>
> ("Warning" is also not needed as it is taken from c-family/*opt, but it
> shouldn't harm either.)

done.

Sergos

        * common.opt: Added new option -fsimd-cost-model.
        * tree-vectorizer.h (unlimited_cost_model): Interface update
        to rely on particular loop info.
        * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
        unlimited_cost_model call according to new interface.
        (vect_peeling_hash_choose_best_peeling): Ditto.
        (vect_enhance_data_refs_alignment): Ditto.
        * tree-vect-slp.c: Ditto.
        * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
        plus issue a warning if the cost model overrides the user's directive.
        * c.opt: Add openmp-simd warning.
        * lang.opt: Ditto.
        * doc/invoke.texi: Added new openmp-simd warning.

[-- Attachment #2: patch3 --]
[-- Type: application/octet-stream, Size: 7746 bytes --]

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..93e8cb5 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used
 
+Wopenmp-simd
+C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
+Warn if a simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..4f0892b 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,10 +2296,17 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization
  
+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with a simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)
 
 EnumValue
+Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
+
+EnumValue
 Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)
 
 EnumValue
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c250385..e5efc62 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces  -Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4804,6 +4805,12 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.
 
+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides the OpenMP or the Cilk Plus
+simd directive set by the user.  The @option{-fsimd-cost-model=unlimited}
+option can be used to relax the cost model.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
@@ -8050,6 +8057,15 @@ is equal to the @code{dynamic} model.
 The default cost model depends on other optimization flags and is
 either @code{dynamic} or @code{cheap}.
 
+@item -fsimd-cost-model=@var{model}
+@opindex fsimd-cost-model
+Alter the cost model used for vectorization of loops marked with the OpenMP
+or Cilk Plus simd directive.  The @var{model} argument should be one of
+@code{unlimited}, @code{dynamic}, @code{cheap} or @code{default}.  The
+@code{default} model reuses the model defined by @option{-fvect-cost-model}.
+All other values of @var{model} have the same meaning as described in
+@option{-fvect-cost-model}.
+
 @item -ftree-vrp
 @opindex ftree-vrp
 Perform Value Range Propagation on trees.  This is similar to the
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..0d328c8 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard
 
+Wopenmp-simd
+Fortran
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }
 
-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }
 
@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling (loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();
 
-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;
 
               /* Handle the aligned case. We may decide to align some other
@@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }
 
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 86ebbd2..8868191 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
 
   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2929,6 +2929,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+	warning_at (vect_location, OPT_Wopenmp_simd, "vectorization "
+		    "did not happen for a simd loop");
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			 "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 247bdfd..4b25964 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }
 
   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index a6c5b59..56ad92c 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -919,9 +919,12 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)
 
 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  if (loop != NULL && loop->force_vect
+      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
+    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED);
 }
 
 /* Source location */

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-21 17:55                                                               ` Sergey Ostanevich
@ 2013-11-25 17:16                                                                 ` Sergey Ostanevich
  2013-11-26  1:50                                                                   ` Tobias Burnus
  2013-11-26 11:06                                                                   ` Richard Biener
  0 siblings, 2 replies; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-25 17:16 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Jakub Jelinek, Richard Biener, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

[-- Attachment #1: Type: text/plain, Size: 4483 bytes --]

Updated the patch (spacing, etc.) according to check_GNU_style.sh.

Added the guard as per Tobias' request.

Is it OK?



On Thu, Nov 21, 2013 at 6:18 PM, Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
> Tobias,
>
>
>> If I understand the patch correctly, the warning is shown in two cases:
>> a) When the loop could be vectorized but the cost model prevented it
>> b) When the loop couldn't be vectorized for other reasons (e.g. not
>> vectorizable because of conditional loop exits, incomplete vectorization
>> support in the compiler, etc.)
>>
>> Do I understand the warning correctly? I am asking because the *opt and
>> *texi wording suggests that only (a) is the case. I cannot test this, as the
>> patch cannot be applied without heavy editing (removal of additional line
>> breaks, taking care of tabs converted into spaces).
>
> I believe it's only for a) case, since warning stays along with the cost
> model report that says only about relative scalar and vector costs of
> iteration. The case of exits and vectorization capabilities is handled earlier,
> since we have some vector code here.
>
> Will try to attach the patch instead of copy-paste here.
>
>>
>> Regarding the warning, I think it sounds a bit colloquial and as if the
>> location information is not available. What do you think of "loop with simd
>> directive not vectorized" or concise not fully correct: "simd loop not
>> vectorized"?
>
> took one of yours.
>
>>
>> Additionally, shouldn't that be guarded by "if (warn_openmp_simd &&"?
>> Otherwise the flag status isn't used at all in the whole patch.
>
> This is strange to me, since it worked as I pass the OPT_Wopenmp_simd
> to the warning_at (). It does:
>    show warinig with -Wopenmp-simd
>    doesn't show warning with -Wall -Wno-openmp-simd
>
>>
>>> +Wopenmp-simd
>>> +C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
>>> +Warn about simd directive is overridden by vectorizer cost model
>>
>>
>> Wording wise, I'd prefer something like:
>> "Warn if an simd directive is overridden by the vectorizer cost model"
>>
>> (Or is it "a simd"? Where are the native speakers when one needs them?)
>
> damn, right! I believe 'a' since simd starts with consonant.
>
>>
>> However, in light of my question above, shouldn't it be "Warn if a loop with
>> simd directive is not vectorized"?
>>
>>
>>
>>> +fsimd-cost-model=
>>> +Common Joined RejectNegative Enum(vect_cost_model)
>>> Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
>>> +Specifies the vectorization cost model for code marked with simd
>>> directive
>>
>>
>> I think an article is lacking before "simd".
>
> done.
>
>>
>>
>>> +@item -Wopenmp-simd
>>> +@opindex Wopenm-simd
>>> +Warn if vectorizer cost model overrides simd directive from user.
>>
>>
>> I think that can be expanded a bit. One could also mention OpenMP/Cilk Plus
>> explicitly. Maybe like:  "Warn if the vectorizer cost model overrides the
>> OpenMP and Cilk Plus simd directives of the user."
>>
>
> done.
>
>> Or if my reading above is correct, how about something like: "Warn if a loop
>> with OpenMP or Cilk Plus simd directive is not vectorized. If only the cost
>> model prevented the vectorization, the @option{-fsimd-cost-model} option can
>> be used to force the vectorization."
>>
>> Which brings me to my next point: -fvect-cost-model= is not documented. I
>> think some words would be helpful, especially about the valid arguments, the
>> default and how it interacts with -fvect-cost-model=.
>
> done.
>
>>
>>
>>> --- a/gcc/fortran/lang.opt
>>> +++ b/gcc/fortran/lang.opt
>>>
>>> +Wopenmp-simd
>>> +Fortran Warning
>>> +; Documented in C
>>
>> ("Warning" is also not needed as it is taken from c-family/*opt, but it
>> shouldn't harm either.)
>
> done.
>
> Sergos
>
>         * common.opt: Added new option -fsimd-cost-model.
>         * tree-vectorizer.h (unlimited_cost_model): Interface update
>         to rely on particular loop info.
>         * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
>         unlimited_cost_model call according to new interface.
>         (vect_peeling_hash_choose_best_peeling): Ditto.
>         (vect_enhance_data_refs_alignment): Ditto.
>         * tree-vect-slp.c: Ditto.
>         * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
>         plus issue a warning in case cost model overrides users' directive.
>         * c.opt: add openmp-simd warning.
>         * lang.opt: Ditto.
>         * doc/invoke.texi: Added new openmp-simd warning.

[-- Attachment #2: patch6 --]
[-- Type: application/octet-stream, Size: 8508 bytes --]

2013-11-25  sergey.y.ostanevich  <sergos.gnu@gmail.com>

	* gcc/c-family/c.opt: Introduced a new openmp-simd warning.
	* gcc/fortran/lang.opt: Ditto.
	* gcc/common.opt: Introduced a new option -fsimd-cost-model.
	* gcc/doc/invoke.texi: Introduced a new openmp-simd warning and
	a new -fsimd-cost-model option.
	* gcc/tree-vectorizer.h (unlimited_cost_model): Interface updated
	to rely on the particular loop info.
	* gcc/tree-vect-data-refs.c (vect_peeling_hash_insert): Ditto.
	(vect_peeling_hash_choose_best_peeling): Ditto.
	(vect_enhance_data_refs_alignment): Ditto.
	* gcc/tree-vect-slp.c (vect_slp_analyze_bb_1): Ditto.
	* gcc/tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
	plus added openmp-simd warining.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index ac67885..2e9a3df 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -596,6 +596,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used
 
+Wopenmp-simd
+C C++ Var(warn_openmp_simd) Warning LangEnabledBy(C C++,Wall)
+Warn if a simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index a7af636..a7f4f2f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2300,10 +2300,17 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization
  
+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with a simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)
 
 EnumValue
+Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
+
+EnumValue
 Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)
 
 EnumValue
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 501d080..5c8f08f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces  -Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3321,6 +3321,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4815,6 +4816,12 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.
 
+@item -Wopenmp-simd
+@opindex Wopenm-simd
+Warn if the vectorizer cost model overrides the OpenMP or the Cilk Plus
+simd directive set by user.  The @option{-fsimd-cost-model=unlimited} can
+be used to relax the cost model.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
@@ -8071,6 +8078,15 @@ is equal to the @code{dynamic} model.
 The default cost model depends on other optimization flags and is
 either @code{dynamic} or @code{cheap}.
 
+@item -fsimd-cost-model=@var{model}
+@opindex fsimd-cost-model
+Alter the cost model used for vectorization of loops marked with the OpenMP
+or Cilk Plus simd directive.  The @var{model} argument should be one of
+@code{unlimited}, @code{dynamic}, @code{cheap} or @code{default}.  The
+@code{default} model means to reuse model defined by @option{fvect-cost-model}.
+All other values of @var{model} have the same meaning as described in
+@option{fvect-cost-model}.
+
 @item -ftree-vrp
 @opindex ftree-vrp
 Perform Value Range Propagation on trees.  This is similar to the
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..0d328c8 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard
 
+Wopenmp-simd
+Fortran
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 5e3b520..dd0ecc4 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1096,7 +1096,8 @@ vect_peeling_hash_insert (loop_vec_info loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }
 
-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }
 
@@ -1206,7 +1207,7 @@ vect_peeling_hash_choose_best_peeling (loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();
 
-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1435,7 +1436,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;
 
               /* Handle the aligned case. We may decide to align some other
@@ -1443,7 +1444,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }
 
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index c91c2e1..fc8fa95 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2703,7 +2703,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
 
   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2936,6 +2936,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (warn_openmp_simd && LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+	warning_at (vect_location, OPT_Wopenmp_simd, "vectorization "
+		    "did not happen for a simd loop");
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			 "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 680a6d8..2387c0d 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2176,7 +2176,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }
 
   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 58884f8..8013983 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -910,9 +910,12 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)
 
 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  if (loop != NULL && loop->force_vect
+      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
+    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED);
 }
 
 /* Source location */

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-25 17:16                                                                 ` Sergey Ostanevich
@ 2013-11-26  1:50                                                                   ` Tobias Burnus
  2013-11-26 11:06                                                                   ` Richard Biener
  1 sibling, 0 replies; 44+ messages in thread
From: Tobias Burnus @ 2013-11-26  1:50 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Jakub Jelinek, Richard Biener, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

Sergey,

Thanks for the modifications and the patch. I tried your patch using 
gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c with the 
following change:

--- a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c
@@ -27,2 +27,3 @@ int main1 ()
    /* unaligned */
+#pragma omp simd
    for (i = 0; i < N/2; i++)

The result, with and without "pragma omp simd" in effect, is the
following for
   gcc -fdump-tree-vect-details -fopt-info-vec-optimized -c -O2 
-ftree-vectorize -fvect-cost-model=dynamic costmodel-vect-31.c
and
   gcc -fdump-tree-vect-details -fopt-info-vec-optimized -c -O2 
-ftree-vectorize -fvect-cost-model=dynamic -fsimd-cost-model=default 
-Wopenmp-simd -fopenmp-simd costmodel-vect-31.c

costmodel-vect-31.c:68:3: note: loop vectorized
costmodel-vect-31.c:68:3: note: loop peeled for vectorization to enhance 
alignment
costmodel-vect-31.c:55:3: note: loop vectorized
costmodel-vect-31.c:42:3: note: loop vectorized

Namely, the loop in lines 29-32 is not vectorized. But also:
-Wopenmp-simd doesn't warn!

If one now enables -fsimd-cost-model=unlimited, one gets the loop 
vectorized:

costmodel-vect-31.c:68:3: note: loop vectorized
costmodel-vect-31.c:68:3: note: loop peeled for vectorization to enhance 
alignment
costmodel-vect-31.c:55:3: note: loop vectorized
costmodel-vect-31.c:42:3: note: loop vectorized
costmodel-vect-31.c:31:16: note: loop vectorized
costmodel-vect-31.c:31:16: note: loop peeled for vectorization to 
enhance alignment
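
[Sketch, not part of the original mail. The two outcomes above are
consistent with the decision logic of the patched unlimited_cost_model ()
helper in tree-vectorizer.h. The standalone C model below copies that
logic from the patch. The enum and struct definitions, the enumerator
values, and the unlimited_for () and check_cases () wrappers are
hypothetical stand-ins for illustration, not GCC code.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for GCC's enum; the numeric values are an assumption.  */
enum vect_cost_model
{
  VECT_COST_MODEL_UNLIMITED,
  VECT_COST_MODEL_CHEAP,
  VECT_COST_MODEL_DYNAMIC,
  VECT_COST_MODEL_DEFAULT
};

/* Stand-in for struct loop; only the field the helper reads.  */
struct loop
{
  bool force_vect;	/* Set for loops marked with a simd directive.  */
};

/* Initial values follow the Init() clauses in the posted common.opt.  */
static enum vect_cost_model flag_vect_cost_model = VECT_COST_MODEL_DEFAULT;
static enum vect_cost_model flag_simd_cost_model = VECT_COST_MODEL_UNLIMITED;

/* Same decision logic as the helper in the posted patch.  */
static bool
unlimited_cost_model (const struct loop *loop)
{
  if (loop != NULL && loop->force_vect
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
}

/* Hypothetical wrapper: set both flags, then query the helper.  */
bool
unlimited_for (enum vect_cost_model vect, enum vect_cost_model simd,
	       const struct loop *loop)
{
  flag_vect_cost_model = vect;
  flag_simd_cost_model = simd;
  return unlimited_cost_model (loop);
}

void
check_cases (void)
{
  struct loop simd_loop = { true };
  struct loop plain_loop = { false };

  /* -fvect-cost-model=dynamic -fsimd-cost-model=default: the simd loop
     falls back to the dynamic model, so the cost model still applies.  */
  assert (!unlimited_for (VECT_COST_MODEL_DYNAMIC, VECT_COST_MODEL_DEFAULT,
			  &simd_loop));

  /* -fvect-cost-model=dynamic -fsimd-cost-model=unlimited: the cost
     model is disabled, but only for the simd loop.  */
  assert (unlimited_for (VECT_COST_MODEL_DYNAMIC, VECT_COST_MODEL_UNLIMITED,
			 &simd_loop));
  assert (!unlimited_for (VECT_COST_MODEL_DYNAMIC, VECT_COST_MODEL_UNLIMITED,
			  &plain_loop));

  /* A NULL loop, as passed from vect_slp_analyze_bb_1, always uses the
     -fvect-cost-model setting.  */
  assert (unlimited_for (VECT_COST_MODEL_UNLIMITED, VECT_COST_MODEL_CHEAP,
			 NULL));
}
```

[Calling check_cases () from a small driver exercises the flag
combinations used in the two command lines above. The model says nothing
about why the -Wopenmp-simd warning is not emitted.]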


Thus, can you check why the warning doesn't work? Additionally, I'd find
it useful to have a test case for both -Wopenmp-simd and
-fsimd-cost-model=unlimited.


Below, a few minor remarks to your patch.

[If possible, try to attach the patch in text format (e.g. Content-Type:
text/plain) instead of binary (Content-Type: application/octet-stream);
that makes reviewing the patch in the email program a bit easier.]

Sergey Ostanevich wrote:
> 2013-11-25  sergey.y.ostanevich  <sergos.gnu@gmail.com>
>
> 	* gcc/c-family/c.opt: Introduced a new openmp-simd warning.
> 	* gcc/fortran/lang.opt: Ditto.
> 	* gcc/common.opt: Introduced a new option -fsimd-cost-model.
> 	* gcc/doc/invoke.texi: Introduced a new openmp-simd warning and
> 	a new -fsimd-cost-model option.
> 	* gcc/tree-vectorizer.h (unlimited_cost_model): Interface updated
> 	to rely on the particular loop info.
> 	* gcc/tree-vect-data-refs.c (vect_peeling_hash_insert): Ditto.
> 	(vect_peeling_hash_choose_best_peeling): Ditto.
> 	(vect_enhance_data_refs_alignment): Ditto.
> 	* gcc/tree-vect-slp.c (vect_slp_analyze_bb_1): Ditto.
> 	* gcc/tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
> 	plus added openmp-simd warining.

s/warining/warning/

Additionally, recall that gcc/, gcc/c-family and gcc/fortran have their 
own ChangeLog. Thus, at the end, the path names in those files should be 
relative to the relevant ChangeLog file ("c.opt", "lang.opt", 
"common.opt", "doc/invoke.texi" etc.)

> +@item -fsimd-cost-model=@var{model}
> ... The
> +@code{default} model means to reuse model defined by @option{fvect-cost-model}.
> +All other values of @var{model} have the same meaning as described in
> +@option{fvect-cost-model}.

I think it should be "to reuse the model" (with "the") and (twice) 
"-fvect..." instead of "fvect..." (i.e. with hyphen).

> @@ -2936,6 +2936,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>     /* vector version will never be profitable.  */
>     else
>       {
> +      if (warn_openmp_simd && LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> +	warning_at (vect_location, OPT_Wopenmp_simd, "vectorization "
> +		    "did not happen for a simd loop");

As the example shows, this code is seemingly not reached.

Otherwise, the code looks fine to me (although, I cannot approve 
anything but GCC's Fortran part).

Tobias

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-25 17:16                                                                 ` Sergey Ostanevich
  2013-11-26  1:50                                                                   ` Tobias Burnus
@ 2013-11-26 11:06                                                                   ` Richard Biener
  2013-11-26 11:18                                                                     ` Jakub Jelinek
  1 sibling, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-26 11:06 UTC (permalink / raw)
  To: Sergey Ostanevich
  Cc: Tobias Burnus, Jakub Jelinek, Richard Henderson, Yuri Rumyantsev,
	gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Mon, 25 Nov 2013, Sergey Ostanevich wrote:

> Updated patch with spaces, etc according to check_GNU_style.sh
> 
> Put guard as per Tobias' request.
> 
> Is it Ok?

See inline comments below (and Tobias' mail).

> 
> 
> On Thu, Nov 21, 2013 at 6:18 PM, Sergey Ostanevich <sergos.gnu@gmail.com> wrote:
> > Tobias,
> >
> >
> >> When I understand the patch correctly, the warning is shown in two cases:
> >> a) When the loop could be vectorized but the cost model prevented it
> >> b) When the loop couldn't be vectorized because of other reasons (e.g. not
> >> vectorizable because of conditional loop exits, incomplete vectorization
> >> support by the compiler etc.)
> >>
> >> Do I correctly understand the warning? I am asking because the *opt and
> >> *texi wording suggests that only (a) is the case. - I cannot test as the
> >> patch cannot be applied with heavy editing (removal of additional line
> >> breaks, taking care of tabs converted into spaces).
> >
> > I believe it's only for a) case, since warning stays along with the cost
> > model report that says only about relative scalar and vector costs of
> > iteration. The case of exits and vectorization capabilities is handled earlier,
> > since we have some vector code here.
> >
> > Will try to attach the patch instead of copy-paste here.
> >
> >>
> >> Regarding the warning, I think it sounds a bit colloquial and as if the
> >> location information is not available. What do you think of "loop with simd
> >> directive not vectorized" or concise not fully correct: "simd loop not
> >> vectorized"?
> >
> > took one of yours.
> >
> >>
> >> Additionally, shouldn't that be guarded by "if (warn_openmp_simd &&"?
> >> Otherwise the flag status isn't used at all in the whole patch.
> >
> > This is strange to me, since it worked as I pass the OPT_Wopenmp_simd
> > to the warning_at (). It does:
> >    show warinig with -Wopenmp-simd
> >    doesn't show warning with -Wall -Wno-openmp-simd
> >
> >>
> >>> +Wopenmp-simd
> >>> +C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
> >>> +Warn about simd directive is overridden by vectorizer cost model
> >>
> >>
> >> Wording wise, I'd prefer something like:
> >> "Warn if an simd directive is overridden by the vectorizer cost model"
> >>
> >> (Or is it "a simd"? Where are the native speakers when one needs them?)
> >
> > damn, right! I believe 'a' since simd starts with consonant.
> >
> >>
> >> However, in light of my question above, shouldn't it be "Warn if a loop with
> >> simd directive is not vectorized"?
> >>
> >>
> >>
> >>> +fsimd-cost-model=
> >>> +Common Joined RejectNegative Enum(vect_cost_model)
> >>> Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
> >>> +Specifies the vectorization cost model for code marked with simd
> >>> directive
> >>
> >>
> >> I think an article is lacking before "simd".
> >
> > done.
> >
> >>
> >>
> >>> +@item -Wopenmp-simd
> >>> +@opindex Wopenm-simd
> >>> +Warn if vectorizer cost model overrides simd directive from user.
> >>
> >>
> >> I think that can be expanded a bit. One could also mention OpenMP/Cilk Plus
> >> explicitly. Maybe like:  "Warn if the vectorizer cost model overrides the
> >> OpenMP and Cilk Plus simd directives of the user."
> >>
> >
> > done.
> >
> >> Or if my reading above is correct, how about something like: "Warn if a loop
> >> with OpenMP or Cilk Plus simd directive is not vectorized. If only the cost
> >> model prevented the vectorization, the @option{-fsimd-cost-model} option can
> >> be used to force the vectorization."
> >>
> >> Which brings me to my next point: -fvect-cost-model= is not documented. I
> >> think some words would be helpful, especially about the valid arguments, the
> >> default and how it interacts with -fvect-cost-model=.
> >
> > done.
> >
> >>
> >>
> >>> --- a/gcc/fortran/lang.opt
> >>> +++ b/gcc/fortran/lang.opt
> >>>
> >>> +Wopenmp-simd
> >>> +Fortran Warning
> >>> +; Documented in C
> >>
> >> ("Warning" is also not needed as it is taken from c-family/*opt, but it
> >> shouldn't harm either.)
> >
> > done.
> >
> > Sergos
> >
> >         * common.opt: Added new option -fsimd-cost-model.
> >         * tree-vectorizer.h (unlimited_cost_model): Interface update
> >         to rely on particular loop info.
> >         * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
> >         unlimited_cost_model call according to new interface.
> >         (vect_peeling_hash_choose_best_peeling): Ditto.
> >         (vect_enhance_data_refs_alignment): Ditto.
> >         * tree-vect-slp.c: Ditto.
> >         * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
> >         plus issue a warning in case cost model overrides users' directive.
> >         * c.opt: add openmp-simd warning.
> >         * lang.opt: Ditto.
> >         * doc/invoke.texi: Added new openmp-simd warning.

2013-11-25  sergey.y.ostanevich  <sergos.gnu@gmail.com>

	* gcc/c-family/c.opt: Introduced a new openmp-simd warning.

No gcc/c-family prefix, c-family has its own ChangeLog.

	* gcc/fortran/lang.opt: Ditto.

Likewise

	* gcc/common.opt: Introduced a new option -fsimd-cost-model.

No gcc/ prefix

	* gcc/doc/invoke.texi: Introduced a new openmp-simd warning and
	a new -fsimd-cost-model option.
	* gcc/tree-vectorizer.h (unlimited_cost_model): Interface updated
	to rely on the particular loop info.
	* gcc/tree-vect-data-refs.c (vect_peeling_hash_insert): Ditto.
	(vect_peeling_hash_choose_best_peeling): Ditto.
	(vect_enhance_data_refs_alignment): Ditto.
	* gcc/tree-vect-slp.c (vect_slp_analyze_bb_1): Ditto.
	* gcc/tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
	plus added openmp-simd warining.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index ac67885..2e9a3df 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -596,6 +596,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used
 
+Wopenmp-simd
+C C++ Var(warn_openmp_simd) Warning LangEnabledBy(C C++,Wall)
+Warn if a simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index a7af636..a7f4f2f 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2300,10 +2300,17 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization
  
+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with a simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)
 
 EnumValue
+Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
+
+EnumValue

Don't add "default", it should be the default ;)  (so yes, please change
the default to VECT_COST_MODEL_DEFAULT)

 Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)
 
 EnumValue
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 501d080..5c8f08f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces  -Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3321,6 +3321,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4815,6 +4816,12 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.
 
+@item -Wopenmp-simd
+@opindex Wopenm-simd
+Warn if the vectorizer cost model overrides the OpenMP or the Cilk Plus
+simd directive set by user.  The @option{-fsimd-cost-model=unlimited} can
+be used to relax the cost model.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
@@ -8071,6 +8078,15 @@ is equal to the @code{dynamic} model.
 The default cost model depends on other optimization flags and is
 either @code{dynamic} or @code{cheap}.
 
+@item -fsimd-cost-model=@var{model}
+@opindex fsimd-cost-model
+Alter the cost model used for vectorization of loops marked with the OpenMP
+or Cilk Plus simd directive.  The @var{model} argument should be one of
+@code{unlimited}, @code{dynamic}, @code{cheap} or @code{default}.  The
+@code{default} model means to reuse model defined by @option{fvect-cost-model}.
+All other values of @var{model} have the same meaning as described in
+@option{fvect-cost-model}.
+
 @item -ftree-vrp
 @opindex ftree-vrp
 Perform Value Range Propagation on trees.  This is similar to the
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..0d328c8 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard
 
+Wopenmp-simd
+Fortran
+; Documented in C
+

 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 5e3b520..dd0ecc4 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1096,7 +1096,8 @@ vect_peeling_hash_insert (loop_vec_info loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }
 
-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }
 
@@ -1206,7 +1207,7 @@ vect_peeling_hash_choose_best_peeling (loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();
 
-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1435,7 +1436,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;
 
               /* Handle the aligned case. We may decide to align some other
@@ -1443,7 +1444,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }
 
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index c91c2e1..fc8fa95 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2703,7 +2703,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
 
   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2936,6 +2936,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (warn_openmp_simd && LOOP_VINFO_LOOP (loop_vinfo)->force_vect)

Testing warn_openmp_simd is redundant here; it is taken care
of by warning_at and the OPT_Wopenmp_simd flag you passed.

Otherwise the patch looks ok to me and is good for trunk.

Thanks,
Richard.


+	warning_at (vect_location, OPT_Wopenmp_simd, "vectorization "
+		    "did not happen for a simd loop");
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			 "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 680a6d8..2387c0d 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2176,7 +2176,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }
 
   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 58884f8..8013983 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -910,9 +910,12 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)
 
 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  if (loop != NULL && loop->force_vect
+      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
+    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED);
 }
 
 /* Source location */
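
[Sketch, not part of the original mail. The inline remark that testing
warn_openmp_simd is redundant can be illustrated with a toy model. It
assumes only what the review states: the option code passed to
warning_at decides whether the diagnostic is emitted. model_warning_at,
warn_guarded, warn_unguarded and check_variants_agree are hypothetical
names; the real warning_at signature and internals are not reproduced.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for one option code and its flag variable.  */
enum opt_code { OPT_Wopenmp_simd };
static bool warn_openmp_simd;

/* Toy replacement for warning_at: emits the message only if the given
   option is enabled and reports whether it did.  */
static bool
model_warning_at (enum opt_code opt, const char *msg)
{
  bool enabled = (opt == OPT_Wopenmp_simd) && warn_openmp_simd;
  if (!enabled)
    return false;
  fprintf (stderr, "warning: %s [-Wopenmp-simd]\n", msg);
  return true;
}

/* As posted: explicit flag test in addition to the option code.  */
bool
warn_guarded (bool force_vect)
{
  if (warn_openmp_simd && force_vect)
    return model_warning_at (OPT_Wopenmp_simd,
			     "vectorization did not happen for a simd loop");
  return false;
}

/* As suggested in the review: rely on the option code alone.  */
bool
warn_unguarded (bool force_vect)
{
  if (force_vect)
    return model_warning_at (OPT_Wopenmp_simd,
			     "vectorization did not happen for a simd loop");
  return false;
}

/* Both variants emit the warning in exactly the same situations.  */
void
check_variants_agree (void)
{
  for (int w = 0; w < 2; w++)
    for (int f = 0; f < 2; f++)
      {
	warn_openmp_simd = (w != 0);
	assert (warn_guarded (f != 0) == warn_unguarded (f != 0));
      }
}
```

[In this model the explicit guard never changes the outcome, which
matches the behaviour reported earlier in the thread: the warning shows
with -Wopenmp-simd and does not show with -Wall -Wno-openmp-simd.]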

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-26 11:06                                                                   ` Richard Biener
@ 2013-11-26 11:18                                                                     ` Jakub Jelinek
  2013-11-27 10:30                                                                       ` Sergey Ostanevich
  0 siblings, 1 reply; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-26 11:18 UTC (permalink / raw)
  To: Richard Biener
  Cc: Sergey Ostanevich, Tobias Burnus, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Tue, Nov 26, 2013 at 10:43:32AM +0100, Richard Biener wrote:
> 2013-11-25  sergey.y.ostanevich  <sergos.gnu@gmail.com>

Please use your name with capital letters and spaces rather than
all lowercase plus dots.

	Jakub

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-26 11:18                                                                     ` Jakub Jelinek
@ 2013-11-27 10:30                                                                       ` Sergey Ostanevich
  2013-11-27 11:16                                                                         ` Tobias Burnus
  0 siblings, 1 reply; 44+ messages in thread
From: Sergey Ostanevich @ 2013-11-27 10:30 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Biener, Tobias Burnus, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

Done.

Sergos

On Tue, Nov 26, 2013 at 1:46 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Nov 26, 2013 at 10:43:32AM +0100, Richard Biener wrote:
>> 2013-11-25  sergey.y.ostanevich  <sergos.gnu@gmail.com>
>
> Please use your name with capital letters and spaces rather than
> all lowercase plus dots.
>
>         Jakub

[-- Attachment #2: patch7.txt --]
[-- Type: text/plain, Size: 8221 bytes --]

2013-11-25  Sergey Ostanevich  <sergos.gnu@gmail.com>

	* c.opt: Introduced a new openmp-simd warning.
	* lang.opt: Ditto.
	* common.opt: Introduced a new option -fsimd-cost-model.
	* doc/invoke.texi: Introduced a new openmp-simd warning and
	a new -fsimd-cost-model option.
	* tree-vectorizer.h (unlimited_cost_model): Interface updated
	to rely on the particular loop info.
	* tree-vect-data-refs.c (vect_peeling_hash_insert): Ditto.
	(vect_peeling_hash_choose_best_peeling): Ditto.
	(vect_enhance_data_refs_alignment): Ditto.
	* tree-vect-slp.c (vect_slp_analyze_bb_1): Ditto.
	* tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
	plus added openmp-simd warning.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index ac67885..2e9a3df 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -596,6 +596,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used
 
+Wopenmp-simd
+C C++ Var(warn_openmp_simd) Warning LangEnabledBy(C C++,Wall)
+Warn if a simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index a7af636..9ece683 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2300,6 +2300,10 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization
  
+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model) Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with a simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 466eee0..2ff413c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces  -Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3321,6 +3321,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4815,6 +4816,12 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.
 
+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides the OpenMP or the Cilk Plus
+simd directive set by user.  The @option{-fsimd-cost-model=unlimited} can
+be used to relax the cost model.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
@@ -8071,6 +8078,14 @@ is equal to the @code{dynamic} model.
 The default cost model depends on other optimization flags and is
 either @code{dynamic} or @code{cheap}.
 
+@item -fsimd-cost-model=@var{model}
+@opindex fsimd-cost-model
+Alter the cost model used for vectorization of loops marked with the OpenMP
+or Cilk Plus simd directive.  The @var{model} argument should be one of
+@code{unlimited}, @code{dynamic}, @code{cheap}.  All values of @var{model}
+have the same meaning as described in @option{fvect-cost-model} and by
+default a cost model defined with @option{fvect-cost-model} is used.
+
 @item -ftree-vrp
 @opindex ftree-vrp
 Perform Value Range Propagation on trees.  This is similar to the
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..0d328c8 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard
 
+Wopenmp-simd
+Fortran
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 8261645..d3e6e99 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1096,7 +1096,8 @@ vect_peeling_hash_insert (loop_vec_info loop_vinfo, struct data_reference *dr,
       *new_slot = slot;
     }
 
-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+      && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     slot->count += VECT_MAX_COST;
 }
 
@@ -1206,7 +1207,7 @@ vect_peeling_hash_choose_best_peeling (loop_vec_info loop_vinfo,
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost ();
 
-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
@@ -1435,7 +1436,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
                  vectorization factor.
                  We do this automtically for cost model, since we calculate cost
                  for every peeling option.  */
-              if (unlimited_cost_model ())
+              if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                 possible_npeel_number = vf /nelements;
 
               /* Handle the aligned case. We may decide to align some other
@@ -1443,7 +1444,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
               if (DR_MISALIGNMENT (dr) == 0)
                 {
                   npeel_tmp = 0;
-                  if (unlimited_cost_model ())
+                  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
                     possible_npeel_number++;
                 }
 
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index c91c2e1..9b46879 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2703,7 +2703,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
 
   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
     {
       dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
       *ret_min_profitable_niters = 0;
@@ -2936,6 +2936,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
     {
+      if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+	warning_at (vect_location, OPT_Wopenmp_simd, "vectorization "
+		    "did not happen for a simd loop");
+
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			 "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 680a6d8..2387c0d 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2176,7 +2176,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
     }
 
   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
       && !vect_bb_vectorization_profitable_p (bb_vinfo))
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 58884f8..8013983 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -910,9 +910,12 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)
 
 /* Return true if the vect cost model is unlimited.  */
 static inline bool
-unlimited_cost_model ()
+unlimited_cost_model (loop_p loop)
 {
-  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
+  if (loop != NULL && loop->force_vect
+      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
+    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
+  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED);
 }
 
 /* Source location */
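
The selection logic of the patched unlimited_cost_model can be tried in isolation. This is a minimal sketch, not the GCC code: the toy_ names are hypothetical, the enumerators are stand-ins for the VECT_COST_MODEL_* values, and the two flags are passed as parameters here instead of being read from the global option variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the VECT_COST_MODEL_* enumerators (values assumed).  */
enum toy_cost_model { TOY_UNLIMITED, TOY_CHEAP, TOY_DYNAMIC, TOY_DEFAULT };

/* Stand-in for the loop structure; only force_vect matters here.  */
struct toy_loop { bool force_vect; };

/* Mirrors the patched predicate: a simd-marked loop uses the simd
   model unless that model is left at "default"; everything else
   uses the general vectorizer model.  */
static bool
toy_unlimited_cost_model (const struct toy_loop *loop,
                          enum toy_cost_model vect_model,
                          enum toy_cost_model simd_model)
{
  if (loop != NULL && loop->force_vect && simd_model != TOY_DEFAULT)
    return simd_model == TOY_UNLIMITED;
  return vect_model == TOY_UNLIMITED;
}

/* Usage example covering each branch.  */
static void
toy_self_check (void)
{
  struct toy_loop simd = { true };
  struct toy_loop plain = { false };

  /* simd loop: the simd-specific model wins.  */
  assert (toy_unlimited_cost_model (&simd, TOY_DYNAMIC, TOY_UNLIMITED));
  assert (!toy_unlimited_cost_model (&simd, TOY_UNLIMITED, TOY_CHEAP));
  /* simd model left at default: fall back to the general model.  */
  assert (toy_unlimited_cost_model (&simd, TOY_UNLIMITED, TOY_DEFAULT));
  /* Non-simd loop, or NULL as passed from the basic-block SLP path.  */
  assert (!toy_unlimited_cost_model (&plain, TOY_DYNAMIC, TOY_UNLIMITED));
  assert (!toy_unlimited_cost_model (NULL, TOY_DYNAMIC, TOY_UNLIMITED));
}
```

The NULL case corresponds to the call in vect_slp_analyze_bb_1, where no loop is available and only the general model is consulted.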

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-27 10:30                                                                       ` Sergey Ostanevich
@ 2013-11-27 11:16                                                                         ` Tobias Burnus
  2013-11-27 11:47                                                                           ` Richard Biener
  0 siblings, 1 reply; 44+ messages in thread
From: Tobias Burnus @ 2013-11-27 11:16 UTC (permalink / raw)
  To: Sergey Ostanevich, Jakub Jelinek
  Cc: Richard Biener, Richard Henderson, Yuri Rumyantsev, gcc-patches,
	Igor Zamyatin, Areg Melik-Adamyan

[-- Attachment #1: Type: text/plain, Size: 632 bytes --]

On 27.11.2013 08:22, Sergey Ostanevich wrote:
> Done.
>

Thanks for fixing Richard's and Jakub's comments and parts of mine.

> +have the same meaning as described in @option{fvect-cost-model} and by
> +default a cost model defined with @option{fvect-cost-model} is used.
As mentioned before, please add a hyphen before fvect (i.e.
"@option{fvect-cost-model}"  -> "@option{-fvect-cost-model}")

Regarding a test case: I still think it would be useful to have one, but I somehow seem to have messed up my previous one - I fail to get it not to vectorize with "omp simd" due to cost reasons. Thus, I don't have one.

Tobias


[-- Attachment #2: costmodel-vect-simd-1.c --]
[-- Type: text/x-csrc, Size: 1099 bytes --]

/* { dg-require-effective-target vect_int } */
/* { dg-additional-options "-fopenmp-simd -Wopenmp-simd -mno-sse2" } */

/* NOTE: Without -mno-sse2, the loop is vectorized.  */

#include <stdarg.h>
#include "../../tree-vect.h"

#define N 32

struct t{
  int k[N];
  int l; 
};
  
struct s{
  char a;	/* aligned */
  char b[N-1];  /* unaligned (offset 1B) */
  char c[N];    /* aligned (offset NB) */
  struct t d;   /* aligned (offset 2NB) */
  struct t e;   /* unaligned (offset 2N+4N+4 B) */
};
 
struct s tmp = { 1 };

int main1 ()
{  
  int i;

  /* unaligned */
#pragma omp simd
  for (i = 0; i < N/2; i++)
    {
      tmp.b[i] = 5; /* { dg-warning "vectorization did not happen for a simd loop" } */
    }

  /* check results:  */
  for (i = 0; i <N/2; i++)
    {
      if (tmp.b[i] != 5)
        abort ();
    }

  return 0;
}

int main (void)
{ 
  check_vect ();
  
  return main1 ();
} 

/* { dg-final { scan-tree-dump-times "loop vectorized" 0 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" } } */
/* ! dg-final ! cleanup-tree-dump "vect" ! ! */

[-- Attachment #3: costmodel-vect-simd-2.c --]
[-- Type: text/x-csrc, Size: 1056 bytes --]

/* { dg-require-effective-target vect_int } */
/* { dg-additional-options "-fopenmp-simd -Wopenmp-simd -fsimd-cost-model=unlimited" } */

#include <stdarg.h>
#include "../../tree-vect.h"

#define N 32

struct t{
  int k[N];
  int l; 
};
  
struct s{
  char a;	/* aligned */
  char b[N-1];  /* unaligned (offset 1B) */
  char c[N];    /* aligned (offset NB) */
  struct t d;   /* aligned (offset 2NB) */
  struct t e;   /* unaligned (offset 2N+4N+4 B) */
};
 
struct s tmp = { 1 };

int main1 ()
{  
  int i;

  /* unaligned */
#pragma omp simd
  for (i = 0; i < N/2; i++)
    {
      tmp.b[i] = 5; /* dg-warning "vectorization did not happen for a simd loop" */
    }

  /* check results:  */
  for (i = 0; i <N/2; i++)
    {
      if (tmp.b[i] != 5)
        abort ();
    }

  return 0;
}

int main (void)
{ 
  check_vect ();
  
  return main1 ();
} 

/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorization not profitable" 0 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-27 11:16                                                                         ` Tobias Burnus
@ 2013-11-27 11:47                                                                           ` Richard Biener
  2013-11-27 12:15                                                                             ` Jakub Jelinek
  0 siblings, 1 reply; 44+ messages in thread
From: Richard Biener @ 2013-11-27 11:47 UTC (permalink / raw)
  To: Tobias Burnus
  Cc: Sergey Ostanevich, Jakub Jelinek, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Wed, 27 Nov 2013, Tobias Burnus wrote:

> On 27.11.2013 08:22, Sergey Ostanevich wrote:
> > Done.
> > 
> 
> Thanks for fixing Richard's and Jakub's comments and parts of mine.
> 
> > +have the same meaning as described in @option{fvect-cost-model} and by
> > +default a cost model defined with @option{fvect-cost-model} is used.
> As mentioned before, please add a hyphen before fvect (i.e.
> "@option{fvect-cost-model}"  -> "@option{-fvect-cost-model}")
> 
> Regarding a test case: I still think it would be useful to have one, but I
> somehow seem to have messed up my previous one - I fail to get it not
> to vectorize with "omp simd" due to cost reasons. Thus, I don't have one.

Ok with that change - no need to re-post before applying.

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.
  2013-11-27 11:47                                                                           ` Richard Biener
@ 2013-11-27 12:15                                                                             ` Jakub Jelinek
  0 siblings, 0 replies; 44+ messages in thread
From: Jakub Jelinek @ 2013-11-27 12:15 UTC (permalink / raw)
  To: Richard Biener
  Cc: Tobias Burnus, Sergey Ostanevich, Richard Henderson,
	Yuri Rumyantsev, gcc-patches, Igor Zamyatin, Areg Melik-Adamyan

On Wed, Nov 27, 2013 at 10:58:30AM +0100, Richard Biener wrote:
> On Wed, 27 Nov 2013, Tobias Burnus wrote:
> 
> > On 27.11.2013 08:22, Sergey Ostanevich wrote:
> > > Done.
> > > 
> > 
> > Thanks for fixing Richard's and Jakub's comments and parts of mine.
> > 
> > > +have the same meaning as described in @option{fvect-cost-model} and by
> > > +default a cost model defined with @option{fvect-cost-model} is used.
> > As mentioned before, please add a hyphen before fvect (i.e.
> > "@option{fvect-cost-model}"  -> "@option{-fvect-cost-model}")
> > 
> > Regarding a test case: I still think it would be useful to have one, but I
> > somehow seem to have messed up my previous one - I fail to get it not
> > to vectorize with "omp simd" due to cost reasons. Thus, I don't have one.
> 
> Ok with that change - no need to re-post before applying.

Note the start of the ChangeLog is still wrong:

2013-11-25  Sergey Ostanevich  <sergos.gnu@gmail.com>

        * c.opt: Introduced a new openmp-simd warning.
        * lang.opt: Ditto.

Because:

2013-11-25  Sergey Ostanevich  <sergos.gnu@gmail.com>

        * c.opt (Wopenmp-simd): New.

needs to go into the gcc/c-family/ChangeLog and

2013-11-25  Sergey Ostanevich  <sergos.gnu@gmail.com>

        * lang.opt (Wopenmp-simd): New.

needs to go into the gcc/fortran/ChangeLog, you can't use ditto there
because it is a different file.  Please fix that up before committing.

	Jakub

^ permalink raw reply	[flat|nested] 44+ messages in thread

Thread overview: 44+ messages
2013-10-31 15:44 [gomp4 simd, RFC] Simple fix to override vectorization cost estimation Yuri Rumyantsev
2013-10-31 16:19 ` Jakub Jelinek
2013-10-31 19:10   ` Richard Biener
2013-11-12 13:18     ` Sergey Ostanevich
2013-11-12 13:46       ` Jakub Jelinek
2013-11-12 14:16         ` Sergey Ostanevich
2013-11-12 14:28           ` Jakub Jelinek
2013-11-12 14:49             ` Sergey Ostanevich
2013-11-12 15:16               ` Jakub Jelinek
2013-11-12 15:39                 ` Richard Biener
2013-11-12 16:15                   ` Jakub Jelinek
2013-11-12 18:59                   ` Sergey Ostanevich
2013-11-13  9:59                     ` Richard Biener
2013-11-13 18:04                       ` Sergey Ostanevich
2013-11-14 10:16                         ` Richard Biener
2013-11-14 20:51                           ` Sergey Ostanevich
2013-11-14 22:31                             ` Richard Biener
2013-11-15 14:25                               ` Sergey Ostanevich
2013-11-15 15:11                                 ` Jakub Jelinek
2013-11-15 15:24                                 ` Richard Biener
2013-11-18 16:23                                   ` Sergey Ostanevich
2013-11-18 16:45                                     ` Richard Biener
2013-11-19 14:48                                       ` Sergey Ostanevich
2013-11-19 14:57                                         ` Richard Biener
2013-11-19 14:58                                           ` Jakub Jelinek
2013-11-19 15:07                                             ` Sergey Ostanevich
2013-11-19 15:08                                               ` Jakub Jelinek
2013-11-19 21:08                                                 ` Sergey Ostanevich
2013-11-19 21:45                                                   ` Tobias Burnus
2013-11-20 14:05                                                     ` Sergey Ostanevich
2013-11-20 15:11                                                       ` Richard Biener
2013-11-20 15:44                                                         ` Jakub Jelinek
2013-11-20 16:11                                                           ` Sergey Ostanevich
2013-11-20 21:27                                                             ` Tobias Burnus
2013-11-21 17:55                                                               ` Sergey Ostanevich
2013-11-25 17:16                                                                 ` Sergey Ostanevich
2013-11-26  1:50                                                                   ` Tobias Burnus
2013-11-26 11:06                                                                   ` Richard Biener
2013-11-26 11:18                                                                     ` Jakub Jelinek
2013-11-27 10:30                                                                       ` Sergey Ostanevich
2013-11-27 11:16                                                                         ` Tobias Burnus
2013-11-27 11:47                                                                           ` Richard Biener
2013-11-27 12:15                                                                             ` Jakub Jelinek
2013-11-19 15:02                                           ` Sergey Ostanevich
