public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
@ 2022-01-12 10:13 Andre Vieira (lists)
  2022-01-12 10:29 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Andre Vieira (lists) @ 2022-01-12 10:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford, Richard Biener

[-- Attachment #1: Type: text/plain, Size: 1139 bytes --]

Hi,

This a fix for the regression caused by '[vect] Re-analyze all modes for 
epilogues'. The earlier patch assumed there was always at least one 
other mode than VOIDmode, but that does not need to be the case.
If we are dealing with a target that does not define more modes for 
'autovectorize_vector_modes', the behaviour before the patch would be to 
try to create an epilogue for the same autodetected_vector_mode, which 
unless the target supported partial vectors would always fail. So as a 
fix I suggest trying to vectorize the epilogue with the 
preferred_simd_mode for QI, mimicking autovectorize_vector_mode, which 
will be skipped if it is not a vector_mode (since that already should 
indicate partial vectors aren't possible) or if no partial vectors are 
supported and its pessimistic NUNITS is larger than the main loop's VF.

Currently bootstrapping and regression testing, otherwise OK for trunk? 
Can someone verify this fixes the issue for PR103971 on powerpc?

gcc/ChangeLog:

         * tree-vect-loop.c (vect-analyze-loop): Handle scenario where 
target
         does add autovectorize_vector_modes.

[-- Attachment #2: fix_epilogue_modes.patch --]
[-- Type: text/plain, Size: 1074 bytes --]

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 6ed2b5f8724e5ebf27592f67d7f6bdfe1ebcf512..c81ebc411312e649f9cd954895244c60c928fee1 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3024,6 +3024,18 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
      ordering is not guaranteed, so we could end up picking a mode for the main
      loop that is after the epilogue's optimal mode.  */
   mode_i = 1;
+  /* If we only had VOIDmode then push the AUTODETECTED_VECTOR_MODE to see if
+     an epilogue can be created with that mode.  */
+  if (vector_modes.length () == 1)
+    {
+      machine_mode preferred_mode
+	= targetm.vectorize.preferred_simd_mode (QImode);
+      /* If the preferred mode isn't a vector mode we will not be needing an
+	  epilogue.  */
+      if (!VECTOR_MODE_P (preferred_mode))
+	return first_loop_vinfo;
+      vector_modes.safe_push (preferred_mode);
+    }
   bool supports_partial_vectors = partial_vectors_supported_p ();
   poly_uint64 first_vinfo_vf = LOOP_VINFO_VECT_FACTOR (first_loop_vinfo);
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 10:13 [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only Andre Vieira (lists)
@ 2022-01-12 10:29 ` Richard Biener
  2022-01-12 10:33   ` Richard Sandiford
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-01-12 10:29 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: gcc-patches, Richard Sandiford

On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:

> Hi,
> 
> This a fix for the regression caused by '[vect] Re-analyze all modes for
> epilogues'. The earlier patch assumed there was always at least one other mode
> than VOIDmode, but that does not need to be the case.
> If we are dealing with a target that does not define more modes for
> 'autovectorize_vector_modes', the behaviour before the patch would be to try
> to create an epilogue for the same autodetected_vector_mode, which unless the
> target supported partial vectors would always fail. So as a fix I suggest
> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> mimicking autovectorize_vector_mode, which will be skipped if it is not a
> vector_mode (since that already should indicate partial vectors aren't
> possible) or if no partial vectors are supported and its pessimistic NUNITS is
> larger than the main loop's VF.
> 
> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
> someone verify this fixes the issue for PR103971 on powerpc?

Why not simply start at mode_i = 0 which means autodetecting the mode
to use for the epilogue?  That appears to be a much simpler solution to
me, including for targets where there are more than one element in the
vector.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 10:29 ` Richard Biener
@ 2022-01-12 10:33   ` Richard Sandiford
  2022-01-12 10:40     ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Sandiford @ 2022-01-12 10:33 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>
>> Hi,
>> 
>> This a fix for the regression caused by '[vect] Re-analyze all modes for
>> epilogues'. The earlier patch assumed there was always at least one other mode
>> than VOIDmode, but that does not need to be the case.
>> If we are dealing with a target that does not define more modes for
>> 'autovectorize_vector_modes', the behaviour before the patch would be to try
>> to create an epilogue for the same autodetected_vector_mode, which unless the
>> target supported partial vectors would always fail. So as a fix I suggest
>> trying to vectorize the epilogue with the preferred_simd_mode for QI,
>> mimicking autovectorize_vector_mode, which will be skipped if it is not a
>> vector_mode (since that already should indicate partial vectors aren't
>> possible) or if no partial vectors are supported and its pessimistic NUNITS is
>> larger than the main loop's VF.
>> 
>> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
>> someone verify this fixes the issue for PR103971 on powerpc?
>
> Why not simply start at mode_i = 0 which means autodetecting the mode
> to use for the epilogue?  That appears to be a much simpler solution to
> me, including for targets where there are more than one element in the
> vector.

VOIDmode doesn't tell us anything about what the autodetected mode
will be, so current short-circuit:

      /* If the target does not support partial vectors we can shorten the
	 number of modes to analyze for the epilogue as we know we can't pick a
	 mode that has at least as many NUNITS as the main loop's vectorization
	 factor, since that would imply the epilogue's vectorization factor
	 would be at least as high as the main loop's and we would be
	 vectorizing for more scalar iterations than there would be left.  */
      if (!supports_partial_vectors
	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
	{
	  mode_i++;
	  if (mode_i == vector_modes.length ())
	    break;
	  continue;
	}

wouldn't be effective.

Thanks,
Richard


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 10:33   ` Richard Sandiford
@ 2022-01-12 10:40     ` Richard Biener
  2022-01-12 10:43       ` Richard Sandiford
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-01-12 10:40 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Andre Vieira (lists), gcc-patches

On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >
> >> Hi,
> >> 
> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
> >> epilogues'. The earlier patch assumed there was always at least one other mode
> >> than VOIDmode, but that does not need to be the case.
> >> If we are dealing with a target that does not define more modes for
> >> 'autovectorize_vector_modes', the behaviour before the patch would be to try
> >> to create an epilogue for the same autodetected_vector_mode, which unless the
> >> target supported partial vectors would always fail. So as a fix I suggest
> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
> >> vector_mode (since that already should indicate partial vectors aren't
> >> possible) or if no partial vectors are supported and its pessimistic NUNITS is
> >> larger than the main loop's VF.
> >> 
> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
> >> someone verify this fixes the issue for PR103971 on powerpc?
> >
> > Why not simply start at mode_i = 0 which means autodetecting the mode
> > to use for the epilogue?  That appears to be a much simpler solution to
> > me, including for targets where there are more than one element in the
> > vector.
> 
> VOIDmode doesn't tell us anything about what the autodetected mode
> will be, so current short-circuit:
> 
>       /* If the target does not support partial vectors we can shorten the
> 	 number of modes to analyze for the epilogue as we know we can't pick a
> 	 mode that has at least as many NUNITS as the main loop's vectorization
> 	 factor, since that would imply the epilogue's vectorization factor
> 	 would be at least as high as the main loop's and we would be
> 	 vectorizing for more scalar iterations than there would be left.  */
>       if (!supports_partial_vectors
> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
> 	{
> 	  mode_i++;
> 	  if (mode_i == vector_modes.length ())
> 	    break;
> 	  continue;
> 	}
> 
> wouldn't be effective.

Well, before this change we simply did

-  /* Handle the case that the original loop can use partial
-     vectorization, but want to only adopt it for the epilogue.
-     The retry should be in the same mode as original.  */
-  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
...
-  else
-    {
-      mode_i = first_loop_next_i;
-      if (mode_i == vector_modes.length ())
-       return first_loop_vinfo;
-    }

and thus didn't bother with epilogue vectorization.  I think we should
then just restore this behavior, not doing epilogue vectorization
if vector_modes.length () == 1?

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 10:40     ` Richard Biener
@ 2022-01-12 10:43       ` Richard Sandiford
  2022-01-12 10:52         ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Sandiford @ 2022-01-12 10:43 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Wed, 12 Jan 2022, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>> >
>> >> Hi,
>> >> 
>> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
>> >> epilogues'. The earlier patch assumed there was always at least one other mode
>> >> than VOIDmode, but that does not need to be the case.
>> >> If we are dealing with a target that does not define more modes for
>> >> 'autovectorize_vector_modes', the behaviour before the patch would be to try
>> >> to create an epilogue for the same autodetected_vector_mode, which unless the
>> >> target supported partial vectors would always fail. So as a fix I suggest
>> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
>> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
>> >> vector_mode (since that already should indicate partial vectors aren't
>> >> possible) or if no partial vectors are supported and its pessimistic NUNITS is
>> >> larger than the main loop's VF.
>> >> 
>> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
>> >> someone verify this fixes the issue for PR103971 on powerpc?
>> >
>> > Why not simply start at mode_i = 0 which means autodetecting the mode
>> > to use for the epilogue?  That appears to be a much simpler solution to
>> > me, including for targets where there are more than one element in the
>> > vector.
>> 
>> VOIDmode doesn't tell us anything about what the autodetected mode
>> will be, so current short-circuit:
>> 
>>       /* If the target does not support partial vectors we can shorten the
>> 	 number of modes to analyze for the epilogue as we know we can't pick a
>> 	 mode that has at least as many NUNITS as the main loop's vectorization
>> 	 factor, since that would imply the epilogue's vectorization factor
>> 	 would be at least as high as the main loop's and we would be
>> 	 vectorizing for more scalar iterations than there would be left.  */
>>       if (!supports_partial_vectors
>> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>> 	{
>> 	  mode_i++;
>> 	  if (mode_i == vector_modes.length ())
>> 	    break;
>> 	  continue;
>> 	}
>> 
>> wouldn't be effective.
>
> Well, before this change we simply did
>
> -  /* Handle the case that the original loop can use partial
> -     vectorization, but want to only adopt it for the epilogue.
> -     The retry should be in the same mode as original.  */
> -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> ...
> -  else
> -    {
> -      mode_i = first_loop_next_i;
> -      if (mode_i == vector_modes.length ())
> -       return first_loop_vinfo;
> -    }
>
> and thus didn't bother with epilogue vectorization.  I think we should
> then just restore this behavior, not doing epilogue vectorization
> if vector_modes.length () == 1?

Yeah, but that case didn't need epilogue vectorisation before.  This
series is adding support for unrolling, and targets with a single vector
size will benefit from epilogues in that case.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 10:43       ` Richard Sandiford
@ 2022-01-12 10:52         ` Richard Biener
  2022-01-12 11:02           ` Richard Sandiford
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-01-12 10:52 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Andre Vieira (lists), gcc-patches

On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >
> >> Richard Biener <rguenther@suse.de> writes:
> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >> >
> >> >> Hi,
> >> >> 
> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
> >> >> epilogues'. The earlier patch assumed there was always at least one other mode
> >> >> than VOIDmode, but that does not need to be the case.
> >> >> If we are dealing with a target that does not define more modes for
> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be to try
> >> >> to create an epilogue for the same autodetected_vector_mode, which unless the
> >> >> target supported partial vectors would always fail. So as a fix I suggest
> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
> >> >> vector_mode (since that already should indicate partial vectors aren't
> >> >> possible) or if no partial vectors are supported and its pessimistic NUNITS is
> >> >> larger than the main loop's VF.
> >> >> 
> >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
> >> >> someone verify this fixes the issue for PR103971 on powerpc?
> >> >
> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
> >> > to use for the epilogue?  That appears to be a much simpler solution to
> >> > me, including for targets where there are more than one element in the
> >> > vector.
> >> 
> >> VOIDmode doesn't tell us anything about what the autodetected mode
> >> will be, so current short-circuit:
> >> 
> >>       /* If the target does not support partial vectors we can shorten the
> >> 	 number of modes to analyze for the epilogue as we know we can't pick a
> >> 	 mode that has at least as many NUNITS as the main loop's vectorization
> >> 	 factor, since that would imply the epilogue's vectorization factor
> >> 	 would be at least as high as the main loop's and we would be
> >> 	 vectorizing for more scalar iterations than there would be left.  */
> >>       if (!supports_partial_vectors
> >> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
> >> 	{
> >> 	  mode_i++;
> >> 	  if (mode_i == vector_modes.length ())
> >> 	    break;
> >> 	  continue;
> >> 	}
> >> 
> >> wouldn't be effective.
> >
> > Well, before this change we simply did
> >
> > -  /* Handle the case that the original loop can use partial
> > -     vectorization, but want to only adopt it for the epilogue.
> > -     The retry should be in the same mode as original.  */
> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> > ...
> > -  else
> > -    {
> > -      mode_i = first_loop_next_i;
> > -      if (mode_i == vector_modes.length ())
> > -       return first_loop_vinfo;
> > -    }
> >
> > and thus didn't bother with epilogue vectorization.  I think we should
> > then just restore this behavior, not doing epilogue vectorization
> > if vector_modes.length () == 1?
> 
> Yeah, but that case didn't need epilogue vectorisation before.  This
> series is adding support for unrolling, and targets with a single vector
> size will benefit from epilogues in that case.

But in that case (which we could detect), we could then just use
autodetected_vector_mode?  Like if we do before epilogue vect

 vector_modes[0] = autodetected_vector_mode;
 mode_i = 0;

thus replace VOIDmode with what we detected and then start at 0?
That is, the proposed patch looks very much like a hack to me.

I suppose the VECTOR_MODE_P check should be added to

      if (!supports_partial_vectors
          && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
first_vinfo_vf))
        {
          mode_i++;

instead.

> Thanks,
> Richard
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 10:52         ` Richard Biener
@ 2022-01-12 11:02           ` Richard Sandiford
  2022-01-12 11:28             ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Sandiford @ 2022-01-12 11:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Wed, 12 Jan 2022, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >
>> >> Richard Biener <rguenther@suse.de> writes:
>> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>> >> >
>> >> >> Hi,
>> >> >> 
>> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
>> >> >> epilogues'. The earlier patch assumed there was always at least one other mode
>> >> >> than VOIDmode, but that does not need to be the case.
>> >> >> If we are dealing with a target that does not define more modes for
>> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be to try
>> >> >> to create an epilogue for the same autodetected_vector_mode, which unless the
>> >> >> target supported partial vectors would always fail. So as a fix I suggest
>> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
>> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
>> >> >> vector_mode (since that already should indicate partial vectors aren't
>> >> >> possible) or if no partial vectors are supported and its pessimistic NUNITS is
>> >> >> larger than the main loop's VF.
>> >> >> 
>> >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
>> >> >> someone verify this fixes the issue for PR103971 on powerpc?
>> >> >
>> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
>> >> > to use for the epilogue?  That appears to be a much simpler solution to
>> >> > me, including for targets where there are more than one element in the
>> >> > vector.
>> >> 
>> >> VOIDmode doesn't tell us anything about what the autodetected mode
>> >> will be, so current short-circuit:
>> >> 
>> >>       /* If the target does not support partial vectors we can shorten the
>> >> 	 number of modes to analyze for the epilogue as we know we can't pick a
>> >> 	 mode that has at least as many NUNITS as the main loop's vectorization
>> >> 	 factor, since that would imply the epilogue's vectorization factor
>> >> 	 would be at least as high as the main loop's and we would be
>> >> 	 vectorizing for more scalar iterations than there would be left.  */
>> >>       if (!supports_partial_vectors
>> >> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>> >> 	{
>> >> 	  mode_i++;
>> >> 	  if (mode_i == vector_modes.length ())
>> >> 	    break;
>> >> 	  continue;
>> >> 	}
>> >> 
>> >> wouldn't be effective.
>> >
>> > Well, before this change we simply did
>> >
>> > -  /* Handle the case that the original loop can use partial
>> > -     vectorization, but want to only adopt it for the epilogue.
>> > -     The retry should be in the same mode as original.  */
>> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
>> > ...
>> > -  else
>> > -    {
>> > -      mode_i = first_loop_next_i;
>> > -      if (mode_i == vector_modes.length ())
>> > -       return first_loop_vinfo;
>> > -    }
>> >
>> > and thus didn't bother with epilogue vectorization.  I think we should
>> > then just restore this behavior, not doing epilogue vectorization
>> > if vector_modes.length () == 1?
>> 
>> Yeah, but that case didn't need epilogue vectorisation before.  This
>> series is adding support for unrolling, and targets with a single vector
>> size will benefit from epilogues in that case.
>
> But in that case (which we could detect), we could then just use
> autodetected_vector_mode?  Like if we do before epilogue vect
>
>  vector_modes[0] = autodetected_vector_mode;
>  mode_i = 0;
>
> thus replace VOIDmode with what we detected and then start at 0?
> That is, the proposed patch looks very much like a hack to me.

You mean check whether the loop is unrolled?  If so, that's what feels
like a hack to me :-)  The question is whether there are enough elements
for epilogue vectorisation to make sense.  The VF is what tells us that.
Unrolling is just one of the things that influences that VF and I don't
think we should check for the individual influences.  It's just the end
result that matters.

> I suppose the VECTOR_MODE_P check should be added to
>
>       if (!supports_partial_vectors
>           && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> first_vinfo_vf))
>         {
>           mode_i++;
>
> instead.

You mean:

      if (!supports_partial_vectors
	  && VECTOR_MODE_P (vector_modes[mode_i])
	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
	{
	  mode_i++;

?  If so, the skip won't be effective the first time round.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 11:02           ` Richard Sandiford
@ 2022-01-12 11:28             ` Richard Biener
  2022-01-12 11:44               ` Richard Sandiford
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-01-12 11:28 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Andre Vieira (lists), gcc-patches

On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >
> >> Richard Biener <rguenther@suse.de> writes:
> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >> >
> >> >> Richard Biener <rguenther@suse.de> writes:
> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >> >> >
> >> >> >> Hi,
> >> >> >> 
> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
> >> >> >> epilogues'. The earlier patch assumed there was always at least one other mode
> >> >> >> than VOIDmode, but that does not need to be the case.
> >> >> >> If we are dealing with a target that does not define more modes for
> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be to try
> >> >> >> to create an epilogue for the same autodetected_vector_mode, which unless the
> >> >> >> target supported partial vectors would always fail. So as a fix I suggest
> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
> >> >> >> vector_mode (since that already should indicate partial vectors aren't
> >> >> >> possible) or if no partial vectors are supported and its pessimistic NUNITS is
> >> >> >> larger than the main loop's VF.
> >> >> >> 
> >> >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
> >> >> >
> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
> >> >> > to use for the epilogue?  That appears to be a much simpler solution to
> >> >> > me, including for targets where there are more than one element in the
> >> >> > vector.
> >> >> 
> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
> >> >> will be, so current short-circuit:
> >> >> 
> >> >>       /* If the target does not support partial vectors we can shorten the
> >> >> 	 number of modes to analyze for the epilogue as we know we can't pick a
> >> >> 	 mode that has at least as many NUNITS as the main loop's vectorization
> >> >> 	 factor, since that would imply the epilogue's vectorization factor
> >> >> 	 would be at least as high as the main loop's and we would be
> >> >> 	 vectorizing for more scalar iterations than there would be left.  */
> >> >>       if (!supports_partial_vectors
> >> >> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
> >> >> 	{
> >> >> 	  mode_i++;
> >> >> 	  if (mode_i == vector_modes.length ())
> >> >> 	    break;
> >> >> 	  continue;
> >> >> 	}
> >> >> 
> >> >> wouldn't be effective.
> >> >
> >> > Well, before this change we simply did
> >> >
> >> > -  /* Handle the case that the original loop can use partial
> >> > -     vectorization, but want to only adopt it for the epilogue.
> >> > -     The retry should be in the same mode as original.  */
> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >> > ...
> >> > -  else
> >> > -    {
> >> > -      mode_i = first_loop_next_i;
> >> > -      if (mode_i == vector_modes.length ())
> >> > -       return first_loop_vinfo;
> >> > -    }
> >> >
> >> > and thus didn't bother with epilogue vectorization.  I think we should
> >> > then just restore this behavior, not doing epilogue vectorization
> >> > if vector_modes.length () == 1?
> >> 
> >> Yeah, but that case didn't need epilogue vectorisation before.  This
> >> series is adding support for unrolling, and targets with a single vector
> >> size will benefit from epilogues in that case.
> >
> > But in that case (which we could detect), we could then just use
> > autodetected_vector_mode?  Like if we do before epilogue vect
> >
> >  vector_modes[0] = autodetected_vector_mode;
> >  mode_i = 0;
> >
> > thus replace VOIDmode with what we detected and then start at 0?
> > That is, the proposed patch looks very much like a hack to me.
> 
> You mean check whether the loop is unrolled?  If so, that's what feels
> like a hack to me :-)  The question is whether there are enough elements
> for epilogue vectorisation to make sense.  The VF is what tells us that.
> Unrolling is just one of the things that influences that VF and I don't
> think we should check for the individual influences.  It's just the end
> result that matters.
> 
> > I suppose the VECTOR_MODE_P check should be added to
> >
> >       if (!supports_partial_vectors
> >           && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> > first_vinfo_vf))
> >         {
> >           mode_i++;
> >
> > instead.
> 
> You mean:
> 
>       if (!supports_partial_vectors
> 	  && VECTOR_MODE_P (vector_modes[mode_i])
> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
> 	{
> 	  mode_i++;
> 
> ?  If so, the skip won't be effective the first time round.

Why?  See above where I set vector_modes[0] to autodetected_vector_mode.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 11:28             ` Richard Biener
@ 2022-01-12 11:44               ` Richard Sandiford
  2022-01-12 11:56                 ` Andre Vieira (lists)
  2022-01-12 11:57                 ` Richard Biener
  0 siblings, 2 replies; 16+ messages in thread
From: Richard Sandiford @ 2022-01-12 11:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: Andre Vieira (lists), gcc-patches

Richard Biener <rguenther@suse.de> writes:
> On Wed, 12 Jan 2022, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >
>> >> Richard Biener <rguenther@suse.de> writes:
>> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >> >
>> >> >> Richard Biener <rguenther@suse.de> writes:
>> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>> >> >> >
>> >> >> >> Hi,
>> >> >> >> 
>> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
>> >> >> >> epilogues'. The earlier patch assumed there was always at least one other mode
>> >> >> >> than VOIDmode, but that does not need to be the case.
>> >> >> >> If we are dealing with a target that does not define more modes for
>> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be to try
>> >> >> >> to create an epilogue for the same autodetected_vector_mode, which unless the
>> >> >> >> target supported partial vectors would always fail. So as a fix I suggest
>> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
>> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
>> >> >> >> vector_mode (since that already should indicate partial vectors aren't
>> >> >> >> possible) or if no partial vectors are supported and its pessimistic NUNITS is
>> >> >> >> larger than the main loop's VF.
>> >> >> >> 
>> >> >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
>> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
>> >> >> >
>> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
>> >> >> > to use for the epilogue?  That appears to be a much simpler solution to
>> >> >> > me, including for targets where there are more than one element in the
>> >> >> > vector.
>> >> >> 
>> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
>> >> >> will be, so current short-circuit:
>> >> >> 
>> >> >>       /* If the target does not support partial vectors we can shorten the
>> >> >> 	 number of modes to analyze for the epilogue as we know we can't pick a
>> >> >> 	 mode that has at least as many NUNITS as the main loop's vectorization
>> >> >> 	 factor, since that would imply the epilogue's vectorization factor
>> >> >> 	 would be at least as high as the main loop's and we would be
>> >> >> 	 vectorizing for more scalar iterations than there would be left.  */
>> >> >>       if (!supports_partial_vectors
>> >> >> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>> >> >> 	{
>> >> >> 	  mode_i++;
>> >> >> 	  if (mode_i == vector_modes.length ())
>> >> >> 	    break;
>> >> >> 	  continue;
>> >> >> 	}
>> >> >> 
>> >> >> wouldn't be effective.
>> >> >
>> >> > Well, before this change we simply did
>> >> >
>> >> > -  /* Handle the case that the original loop can use partial
>> >> > -     vectorization, but want to only adopt it for the epilogue.
>> >> > -     The retry should be in the same mode as original.  */
>> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
>> >> > ...
>> >> > -  else
>> >> > -    {
>> >> > -      mode_i = first_loop_next_i;
>> >> > -      if (mode_i == vector_modes.length ())
>> >> > -       return first_loop_vinfo;
>> >> > -    }
>> >> >
>> >> > and thus didn't bother with epilogue vectorization.  I think we should
>> >> > then just restore this behavior, not doing epilogue vectorization
>> >> > if vector_modes.length () == 1?
>> >> 
>> >> Yeah, but that case didn't need epilogue vectorisation before.  This
>> >> series is adding support for unrolling, and targets with a single vector
>> >> size will benefit from epilogues in that case.
>> >
>> > But in that case (which we could detect), we could then just use
>> > autodetected_vector_mode?  Like if we do before epilogue vect
>> >
>> >  vector_modes[0] = autodetected_vector_mode;
>> >  mode_i = 0;
>> >
>> > thus replace VOIDmode with what we detected and then start at 0?
>> > That is, the proposed patch looks very much like a hack to me.
>> 
>> You mean check whether the loop is unrolled?  If so, that's what feels
>> like a hack to me :-)  The question is whether there are enough elements
>> for epilogue vectorisation to make sense.  The VF is what tells us that.
>> Unrolling is just one of the things that influences that VF and I don't
>> think we should check for the individual influences.  It's just the end
>> result that matters.
>> 
>> > I suppose the VECTOR_MODE_P check should be added to
>> >
>> >       if (!supports_partial_vectors
>> >           && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
>> > first_vinfo_vf))
>> >         {
>> >           mode_i++;
>> >
>> > instead.
>> 
>> You mean:
>> 
>>       if (!supports_partial_vectors
>> 	  && VECTOR_MODE_P (vector_modes[mode_i])
>> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>> 	{
>> 	  mode_i++;
>> 
>> ?  If so, the skip won't be effective the first time round.
>
> Why?  See above where I set vector_modes[0] to autodetected_vector_mode.

Ah, yeah, I guess that works, sorry.  It still feels odd to iterate
through N+1 modes when we don't need autodetection (and with the above,
don't use autodetection), but I can live with it. :-)

Another alternative would be to push autodetected_vector_mode when the
length is 1 and keep 1 as the starting point.

Richard

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 11:44               ` Richard Sandiford
@ 2022-01-12 11:56                 ` Andre Vieira (lists)
  2022-01-12 11:59                   ` Richard Biener
  2022-01-12 11:57                 ` Richard Biener
  1 sibling, 1 reply; 16+ messages in thread
From: Andre Vieira (lists) @ 2022-01-12 11:56 UTC (permalink / raw)
  To: Richard Biener, gcc-patches, richard.sandiford


On 12/01/2022 11:44, Richard Sandiford wrote:
> Another alternative would be to push autodetected_vector_mode when the
> length is 1 and keep 1 as the starting point.
>
> Richard

I'm guessing we would still want to skip epilogue vectorization if 
!VECTOR_MODE_P (autodetected_vector_mode) in that case?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 11:44               ` Richard Sandiford
  2022-01-12 11:56                 ` Andre Vieira (lists)
@ 2022-01-12 11:57                 ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Biener @ 2022-01-12 11:57 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Andre Vieira (lists), gcc-patches

On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >
> >> Richard Biener <rguenther@suse.de> writes:
> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >> >
> >> >> Richard Biener <rguenther@suse.de> writes:
> >> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >> >> >
> >> >> >> Richard Biener <rguenther@suse.de> writes:
> >> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >> >> >> >
> >> >> >> >> Hi,
> >> >> >> >> 
> >> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
> >> >> >> >> epilogues'. The earlier patch assumed there was always at least one other mode
> >> >> >> >> than VOIDmode, but that does not need to be the case.
> >> >> >> >> If we are dealing with a target that does not define more modes for
> >> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be to try
> >> >> >> >> to create an epilogue for the same autodetected_vector_mode, which unless the
> >> >> >> >> target supported partial vectors would always fail. So as a fix I suggest
> >> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
> >> >> >> >> vector_mode (since that already should indicate partial vectors aren't
> >> >> >> >> possible) or if no partial vectors are supported and its pessimistic NUNITS is
> >> >> >> >> larger than the main loop's VF.
> >> >> >> >> 
> >> >> >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
> >> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
> >> >> >> >
> >> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
> >> >> >> > to use for the epilogue?  That appears to be a much simpler solution to
> >> >> >> > me, including for targets where there are more than one element in the
> >> >> >> > vector.
> >> >> >> 
> >> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
> >> >> >> will be, so current short-circuit:
> >> >> >> 
> >> >> >>       /* If the target does not support partial vectors we can shorten the
> >> >> >> 	 number of modes to analyze for the epilogue as we know we can't pick a
> >> >> >> 	 mode that has at least as many NUNITS as the main loop's vectorization
> >> >> >> 	 factor, since that would imply the epilogue's vectorization factor
> >> >> >> 	 would be at least as high as the main loop's and we would be
> >> >> >> 	 vectorizing for more scalar iterations than there would be left.  */
> >> >> >>       if (!supports_partial_vectors
> >> >> >> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
> >> >> >> 	{
> >> >> >> 	  mode_i++;
> >> >> >> 	  if (mode_i == vector_modes.length ())
> >> >> >> 	    break;
> >> >> >> 	  continue;
> >> >> >> 	}
> >> >> >> 
> >> >> >> wouldn't be effective.
> >> >> >
> >> >> > Well, before this change we simply did
> >> >> >
> >> >> > -  /* Handle the case that the original loop can use partial
> >> >> > -     vectorization, but want to only adopt it for the epilogue.
> >> >> > -     The retry should be in the same mode as original.  */
> >> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >> >> > ...
> >> >> > -  else
> >> >> > -    {
> >> >> > -      mode_i = first_loop_next_i;
> >> >> > -      if (mode_i == vector_modes.length ())
> >> >> > -       return first_loop_vinfo;
> >> >> > -    }
> >> >> >
> >> >> > and thus didn't bother with epilogue vectorization.  I think we should
> >> >> > then just restore this behavior, not doing epilogue vectorization
> >> >> > if vector_modes.length () == 1?
> >> >> 
> >> >> Yeah, but that case didn't need epilogue vectorisation before.  This
> >> >> series is adding support for unrolling, and targets with a single vector
> >> >> size will benefit from epilogues in that case.
> >> >
> >> > But in that case (which we could detect), we could then just use
> >> > autodetected_vector_mode?  Like if we do before epilogue vect
> >> >
> >> >  vector_modes[0] = autodetected_vector_mode;
> >> >  mode_i = 0;
> >> >
> >> > thus replace VOIDmode with what we detected and then start at 0?
> >> > That is, the proposed patch looks very much like a hack to me.
> >> 
> >> You mean check whether the loop is unrolled?  If so, that's what feels
> >> like a hack to me :-)  The question is whether there are enough elements
> >> for epilogue vectorisation to make sense.  The VF is what tells us that.
> >> Unrolling is just one of the things that influences that VF and I don't
> >> think we should check for the individual influences.  It's just the end
> >> result that matters.
> >> 
> >> > I suppose the VECTOR_MODE_P check should be added to
> >> >
> >> >       if (!supports_partial_vectors
> >> >           && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> >> > first_vinfo_vf))
> >> >         {
> >> >           mode_i++;
> >> >
> >> > instead.
> >> 
> >> You mean:
> >> 
> >>       if (!supports_partial_vectors
> >> 	  && VECTOR_MODE_P (vector_modes[mode_i])
> >> 	  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
> >> 	{
> >> 	  mode_i++;
> >> 
> >> ?  If so, the skip won't be effective the first time round.
> >
> > Why?  See above where I set vector_modes[0] to autodetected_vector_mode.
> 
> Ah, yeah, I guess that works, sorry.  It still feels odd to iterate
> through N+1 modes when we don't need autodetection (and with the above,
> don't use autodetection), but I can live with it. :-)

Of course we do autodetection anyway when doing main loop vectorization,
and when we want to iterate over all modes for epilogues it feels
natural to try that in the same order.

> Another alternative would be to push autodetected_vector_mode when the
> length is 1 and keep 1 as the starting point.

I understood that this is what Andres patch tried to do - just inventing
some "other" autodetected mode.  Note my idea eventually allows us to
get rid of autodetected_vector_mode as separate variable and just
have it be a reference to vector_modes[0] (or use that explicitely).

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 11:56                 ` Andre Vieira (lists)
@ 2022-01-12 11:59                   ` Richard Biener
  2022-01-12 12:41                     ` Andre Vieira (lists)
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2022-01-12 11:59 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: gcc-patches, richard.sandiford

On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:

> 
> On 12/01/2022 11:44, Richard Sandiford wrote:
> > Another alternative would be to push autodetected_vector_mode when the
> > length is 1 and keep 1 as the starting point.
> >
> > Richard
> 
> I'm guessing we would still want to skip epilogue vectorization if
> !VECTOR_MODE_P (autodetected_vector_mode) in that case?

Practically we currently only support fixed width word_mode there,
but eventually one could end up with 64bit DImode for the main loop
and 32bit V4QImode in the epilogue ... so not sure if it's worth
special-casing.  But I don't mind adding that skip.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 11:59                   ` Richard Biener
@ 2022-01-12 12:41                     ` Andre Vieira (lists)
  2022-01-12 12:57                       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Andre Vieira (lists) @ 2022-01-12 12:41 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]


On 12/01/2022 11:59, Richard Biener wrote:
> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>
>> On 12/01/2022 11:44, Richard Sandiford wrote:
>>> Another alternative would be to push autodetected_vector_mode when the
>>> length is 1 and keep 1 as the starting point.
>>>
>>> Richard
>> I'm guessing we would still want to skip epilogue vectorization if
>> !VECTOR_MODE_P (autodetected_vector_mode) in that case?
> Practically we currently only support fixed width word_mode there,
> but eventually one could end up with 64bit DImode for the main loop
> and 32bit V4QImode in the epilogue ... so not sure if it's worth
> special-casing.  But I don't mind adding that skip.
>
> Richard.

I left out the skip, it shouldn't break anything as it would try that 
same mode before anyway.
Just to clarify what I meant though was to skip if 
autodetected_vector_mode wasn't a vector AND the target didn't define 
autovectorize_vector_modes, so in that scenario it wouldn't ever try  
V4QImode for the epilogue if the mainloop was autodetected DImode, I 
think...
Either way, this is less code, less complicated and doesn't analyze more 
than it did before the original patch, so I'm happy with that too.

Is this what you had in mind?

Andre

[-- Attachment #2: fix_epilogue_modes2.patch --]
[-- Type: text/plain, Size: 939 bytes --]

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 6ed2b5f8724e5ebf27592f67d7f6bdfe1ebcf512..03459363afa48f0e2753bc7aa18cbf2771d2a4e5 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3023,7 +3023,16 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
      array may contain length-agnostic and length-specific modes.  Their
      ordering is not guaranteed, so we could end up picking a mode for the main
      loop that is after the epilogue's optimal mode.  */
-  mode_i = 1;
+  if (vector_modes.length () == 1)
+    {
+      /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see if
+	 an epilogue can be created with that mode.  */
+      vector_modes[0] = autodetected_vector_mode;
+      mode_i = 0;
+    }
+  else
+    mode_i = 1;
+
   bool supports_partial_vectors = partial_vectors_supported_p ();
   poly_uint64 first_vinfo_vf = LOOP_VINFO_VECT_FACTOR (first_loop_vinfo);
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 12:41                     ` Andre Vieira (lists)
@ 2022-01-12 12:57                       ` Richard Biener
  2022-01-12 13:48                         ` Bill Schmidt
  2022-01-12 14:42                         ` Andre Vieira (lists)
  0 siblings, 2 replies; 16+ messages in thread
From: Richard Biener @ 2022-01-12 12:57 UTC (permalink / raw)
  To: Andre Vieira (lists); +Cc: gcc-patches, richard.sandiford

On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:

> 
> On 12/01/2022 11:59, Richard Biener wrote:
> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >
> >> On 12/01/2022 11:44, Richard Sandiford wrote:
> >>> Another alternative would be to push autodetected_vector_mode when the
> >>> length is 1 and keep 1 as the starting point.
> >>>
> >>> Richard
> >> I'm guessing we would still want to skip epilogue vectorization if
> >> !VECTOR_MODE_P (autodetected_vector_mode) in that case?
> > Practically we currently only support fixed width word_mode there,
> > but eventually one could end up with 64bit DImode for the main loop
> > and 32bit V4QImode in the epilogue ... so not sure if it's worth
> > special-casing.  But I don't mind adding that skip.
> >
> > Richard.
> 
> I left out the skip, it shouldn't break anything as it would try that same
> mode before anyway.
> Just to clarify what I meant though was to skip if autodetected_vector_mode
> wasn't a vector AND the target didn't define autovectorize_vector_modes, so in
> that scenario it wouldn't ever try  V4QImode for the epilogue if the mainloop
> was autodetected DImode, I think...
> Either way, this is less code, less complicated and doesn't analyze more than
> it did before the original patch, so I'm happy with that too.
> 
> Is this what you had in mind?

-  mode_i = 1;
+  if (vector_modes.length () == 1)
+    {
+      /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see
if
+        an epilogue can be created with that mode.  */
+      vector_modes[0] = autodetected_vector_mode;
+      mode_i = 0;
+    }
+  else
+    mode_i = 1;
+

I would have left out the condition and unconditionally do

  vector_modes[0] = autodetected_vector_mode;
  mode_i = 0;

but OK if you think it makes sense to special case length == 1.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 12:57                       ` Richard Biener
@ 2022-01-12 13:48                         ` Bill Schmidt
  2022-01-12 14:42                         ` Andre Vieira (lists)
  1 sibling, 0 replies; 16+ messages in thread
From: Bill Schmidt @ 2022-01-12 13:48 UTC (permalink / raw)
  To: Richard Biener, Andre Vieira (lists); +Cc: richard.sandiford, gcc-patches

I think we need a fix or a revert for this today, please.  Bootstrap has been broken
for a couple of days during the last week of stage 3, which is really problematic.

Thanks,
Bill

On 1/12/22 6:57 AM, Richard Biener via Gcc-patches wrote:
> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>
>> On 12/01/2022 11:59, Richard Biener wrote:
>>> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>>>
>>>> On 12/01/2022 11:44, Richard Sandiford wrote:
>>>>> Another alternative would be to push autodetected_vector_mode when the
>>>>> length is 1 and keep 1 as the starting point.
>>>>>
>>>>> Richard
>>>> I'm guessing we would still want to skip epilogue vectorization if
>>>> !VECTOR_MODE_P (autodetected_vector_mode) in that case?
>>> Practically we currently only support fixed width word_mode there,
>>> but eventually one could end up with 64bit DImode for the main loop
>>> and 32bit V4QImode in the epilogue ... so not sure if it's worth
>>> special-casing.  But I don't mind adding that skip.
>>>
>>> Richard.
>> I left out the skip, it shouldn't break anything as it would try that same
>> mode before anyway.
>> Just to clarify what I meant though was to skip if autodetected_vector_mode
>> wasn't a vector AND the target didn't define autovectorize_vector_modes, so in
>> that scenario it wouldn't ever try  V4QImode for the epilogue if the mainloop
>> was autodetected DImode, I think...
>> Either way, this is less code, less complicated and doesn't analyze more than
>> it did before the original patch, so I'm happy with that too.
>>
>> Is this what you had in mind?
> -  mode_i = 1;
> +  if (vector_modes.length () == 1)
> +    {
> +      /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see
> if
> +        an epilogue can be created with that mode.  */
> +      vector_modes[0] = autodetected_vector_mode;
> +      mode_i = 0;
> +    }
> +  else
> +    mode_i = 1;
> +
>
> I would have left out the condition and unconditionally do
>
>   vector_modes[0] = autodetected_vector_mode;
>   mode_i = 0;
>
> but OK if you think it makes sense to special case length == 1.
>
> Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only
  2022-01-12 12:57                       ` Richard Biener
  2022-01-12 13:48                         ` Bill Schmidt
@ 2022-01-12 14:42                         ` Andre Vieira (lists)
  1 sibling, 0 replies; 16+ messages in thread
From: Andre Vieira (lists) @ 2022-01-12 14:42 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, richard.sandiford


On 12/01/2022 12:57, Richard Biener wrote:
> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>
>> On 12/01/2022 11:59, Richard Biener wrote:
>>> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>>>
>>>> On 12/01/2022 11:44, Richard Sandiford wrote:
>>>>> Another alternative would be to push autodetected_vector_mode when the
>>>>> length is 1 and keep 1 as the starting point.
>>>>>
>>>>> Richard
>>>> I'm guessing we would still want to skip epilogue vectorization if
>>>> !VECTOR_MODE_P (autodetected_vector_mode) in that case?
>>> Practically we currently only support fixed width word_mode there,
>>> but eventually one could end up with 64bit DImode for the main loop
>>> and 32bit V4QImode in the epilogue ... so not sure if it's worth
>>> special-casing.  But I don't mind adding that skip.
>>>
>>> Richard.
>> I left out the skip, it shouldn't break anything as it would try that same
>> mode before anyway.
>> Just to clarify what I meant though was to skip if autodetected_vector_mode
>> wasn't a vector AND the target didn't define autovectorize_vector_modes, so in
>> that scenario it wouldn't ever try  V4QImode for the epilogue if the mainloop
>> was autodetected DImode, I think...
>> Either way, this is less code, less complicated and doesn't analyze more than
>> it did before the original patch, so I'm happy with that too.
>>
>> Is this what you had in mind?
> -  mode_i = 1;
> +  if (vector_modes.length () == 1)
> +    {
> +      /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see
> if
> +        an epilogue can be created with that mode.  */
> +      vector_modes[0] = autodetected_vector_mode;
> +      mode_i = 0;
> +    }
> +  else
> +    mode_i = 1;
> +
>
> I would have left out the condition and unconditionally do
>
>    vector_modes[0] = autodetected_vector_mode;
>    mode_i = 0;
>
> but OK if you think it makes sense to special case length == 1.
>
> Richard.

Tested without the special casing, all good, only have performance 
regressions left (which I'm working on), will commit this to fix the ICEs.

Thanks,
Andre


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-01-12 14:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-12 10:13 [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only Andre Vieira (lists)
2022-01-12 10:29 ` Richard Biener
2022-01-12 10:33   ` Richard Sandiford
2022-01-12 10:40     ` Richard Biener
2022-01-12 10:43       ` Richard Sandiford
2022-01-12 10:52         ` Richard Biener
2022-01-12 11:02           ` Richard Sandiford
2022-01-12 11:28             ` Richard Biener
2022-01-12 11:44               ` Richard Sandiford
2022-01-12 11:56                 ` Andre Vieira (lists)
2022-01-12 11:59                   ` Richard Biener
2022-01-12 12:41                     ` Andre Vieira (lists)
2022-01-12 12:57                       ` Richard Biener
2022-01-12 13:48                         ` Bill Schmidt
2022-01-12 14:42                         ` Andre Vieira (lists)
2022-01-12 11:57                 ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).