public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Possibly latent issue with combine ?
@ 2019-06-26 13:58 Prathamesh Kulkarni
  2019-06-26 16:18 ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Prathamesh Kulkarni @ 2019-06-26 13:58 UTC (permalink / raw)
  To: GCC Development, Richard Sandiford

Hi,
For following test-case, taken from pr88152.C:

#include <x86intrin.h>

template <typename T, size_t N>
using V [[gnu::vector_size(N)]] = T;

int f10 (V<unsigned long long, 16> a)
{
  return _mm_movemask_pd (reinterpret_cast<__m128d> (a > __LONG_LONG_MAX__));
}

.optimized dump shows:

f10 (V a)
{
  vector(2) signed long _1;
  vector(2) long int _2;
  vector(2) double _3;
  int _6;

  <bb 2> [local count: 1073741824]:
  _1 = VIEW_CONVERT_EXPR<vector(2) signed long>(a_4(D));
  _2 = VEC_COND_EXPR <_1 < { 0, 0 }, { -1, -1 }, { 0, 0 }>;
  _3 = VIEW_CONVERT_EXPR<__m128d>(_2);
  _6 = __builtin_ia32_movmskpd (_3); [tail call]
  return _6;

}

IIUC, we're using -1 to represent true and 0 as false.
combine then does following combinations:

Trying 7 -> 9:
    7: r90:V2DI=r89:V2DI>r93:V2DI
      REG_DEAD r93:V2DI
      REG_DEAD r89:V2DI
    9: r91:V2DF=r90:V2DI#0
      REG_DEAD r90:V2DI
Successfully matched this instruction:
(set (subreg:V2DI (reg:V2DF 91) 0)
    (gt:V2DI (reg:V2DI 89)
        (reg:V2DI 93)))
allowing combination of insns 7 and 9

Trying 6, 9 -> 10:
    6: r89:V2DI=const_vector
    9: r91:V2DF#0=r89:V2DI>r93:V2DI
      REG_DEAD r89:V2DI
      REG_DEAD r93:V2DI
   10: r87:SI=unspec[r91:V2DF] 43
      REG_DEAD r91:V2DF
Successfully matched this instruction:
(set (reg:SI 87)
    (unspec:SI [
            (lt:V2DF (reg:V2DI 93)
                (const_vector:V2DI [
                        (const_int 0 [0]) repeated x2
                    ]))
        ] UNSPEC_MOVMSK))

Is the above folding correct, since lt has V2DF mode,
and casting -1 (literally) to DFmode would result in -NaN ?
Also, should result of lt be having only integral modes ?

split2 then folds insn 10 into:

(insn 22 9 16 2 (set (reg:SI 0 ax [87])
        (unspec:SI [
                (reg:V2DF 20 xmm0 [93])
            ] UNSPEC_MOVMSK))
"../../stage1-build/gcc/include/emmintrin.h":958:34 4222
{sse2_movmskpd}
     (nil))

deleting insn 10.

The issue is my patch for PR88833 results in following propagation in forwprop1:

In insn 10, replacing
 (unspec:SI [
            (reg:V2DF 91)
        ] UNSPEC_MOVMSK)
 with (unspec:SI [
            (subreg:V2DF (reg:V2DI 90) 0)
        ] UNSPEC_MOVMSK)

deleting insn 9 and this inhibits the above combinations,
resulting in failure of PR88152.C

With patch, combine shows:
Trying 7 -> 10:
    7: r90:V2DI=r89:V2DI>r93:V2DI
      REG_DEAD r93:V2DI
      REG_DEAD r89:V2DI
   10: r87:SI=unspec[r90:V2DI#0] 43
      REG_DEAD r90:V2DI
Failed to match this instruction:
(set (reg:SI 87)
    (unspec:SI [
            (subreg:V2DF (gt:V2DI (reg:V2DI 89)
                    (reg:V2DI 93)) 0)
        ] UNSPEC_MOVMSK))

and subsequently fails to match 6, 7 -> 10

Patch:
http://people.linaro.org/~prathamesh.kulkarni/pr88833-10.diff

Upstream discussion about the issue:
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01651.html

Thanks,
Prathamesh

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Possibly latent issue with combine ?
  2019-06-26 13:58 Possibly latent issue with combine ? Prathamesh Kulkarni
@ 2019-06-26 16:18 ` Segher Boessenkool
  2019-06-26 16:45   ` Richard Sandiford
  0 siblings, 1 reply; 4+ messages in thread
From: Segher Boessenkool @ 2019-06-26 16:18 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: GCC Development, Richard Sandiford

On Wed, Jun 26, 2019 at 07:27:20PM +0530, Prathamesh Kulkarni wrote:
> combine then does following combinations:
> 
> Trying 7 -> 9:
>     7: r90:V2DI=r89:V2DI>r93:V2DI
>       REG_DEAD r93:V2DI
>       REG_DEAD r89:V2DI
>     9: r91:V2DF=r90:V2DI#0
>       REG_DEAD r90:V2DI
> Successfully matched this instruction:
> (set (subreg:V2DI (reg:V2DF 91) 0)
>     (gt:V2DI (reg:V2DI 89)
>         (reg:V2DI 93)))
> allowing combination of insns 7 and 9
> 
> Trying 6, 9 -> 10:
>     6: r89:V2DI=const_vector
>     9: r91:V2DF#0=r89:V2DI>r93:V2DI
>       REG_DEAD r89:V2DI
>       REG_DEAD r93:V2DI
>    10: r87:SI=unspec[r91:V2DF] 43
>       REG_DEAD r91:V2DF
> Successfully matched this instruction:
> (set (reg:SI 87)
>     (unspec:SI [
>             (lt:V2DF (reg:V2DI 93)
>                 (const_vector:V2DI [
>                         (const_int 0 [0]) repeated x2
>                     ]))
>         ] UNSPEC_MOVMSK))

Both of these are obviously correct, they are simple substitutions.
If this does not do what you want, the problem is elsewhere, not in
combine, afaics?

> Is the above folding correct, since lt has V2DF mode,
> and casting -1 (literally) to DFmode would result in -NaN ?

Combine does not introduce any of that, it was there already.

> Also, should result of lt be having only integral modes ?

Apparently your machine description likes this fine.  Combine does not
ask questions.


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Possibly latent issue with combine ?
  2019-06-26 16:18 ` Segher Boessenkool
@ 2019-06-26 16:45   ` Richard Sandiford
  2019-06-26 17:14     ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Sandiford @ 2019-06-26 16:45 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Prathamesh Kulkarni, GCC Development

Segher Boessenkool <segher@kernel.crashing.org> writes:
> On Wed, Jun 26, 2019 at 07:27:20PM +0530, Prathamesh Kulkarni wrote:
>> combine then does following combinations:
>> 
>> Trying 7 -> 9:
>>     7: r90:V2DI=r89:V2DI>r93:V2DI
>>       REG_DEAD r93:V2DI
>>       REG_DEAD r89:V2DI
>>     9: r91:V2DF=r90:V2DI#0
>>       REG_DEAD r90:V2DI
>> Successfully matched this instruction:
>> (set (subreg:V2DI (reg:V2DF 91) 0)
>>     (gt:V2DI (reg:V2DI 89)
>>         (reg:V2DI 93)))
>> allowing combination of insns 7 and 9
>> 
>> Trying 6, 9 -> 10:
>>     6: r89:V2DI=const_vector
>>     9: r91:V2DF#0=r89:V2DI>r93:V2DI
>>       REG_DEAD r89:V2DI
>>       REG_DEAD r93:V2DI
>>    10: r87:SI=unspec[r91:V2DF] 43
>>       REG_DEAD r91:V2DF
>> Successfully matched this instruction:
>> (set (reg:SI 87)
>>     (unspec:SI [
>>             (lt:V2DF (reg:V2DI 93)
>>                 (const_vector:V2DI [
>>                         (const_int 0 [0]) repeated x2
>>                     ]))
>>         ] UNSPEC_MOVMSK))
>
> Both of these are obviously correct, they are simple substitutions.
> If this does not do what you want, the problem is elsewhere, not in
> combine, afaics?

"Obviously" correct seems a stretch :-)  We can only fold:

  (subreg:V2DF (foo:V2DI X) 0)

to:

  (foo:V2DF X)

for certain operations.  E.g. it'd be wrong to do it for foo=plus.
IMO it's wrong for comparisons too.  A comparison between integers
that produces a floating-point result makes no sense, whatever the
target thinks about it.

>> Is the above folding correct, since lt has V2DF mode,
>> and casting -1 (literally) to DFmode would result in -NaN ?
>
> Combine does not introduce any of that, it was there already.

The original insns had an lt:V2DI between V2DI inputs and a V2DF
subreg of the result.  It's combine that turns that into a lt:V2DF
between V2DI inputs.

Richard

>> Also, should result of lt be having only integral modes ?
>
> Apparently your machine description likes this fine.  Combine does not
> ask questions.
>
>
> Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Possibly latent issue with combine ?
  2019-06-26 16:45   ` Richard Sandiford
@ 2019-06-26 17:14     ` Segher Boessenkool
  0 siblings, 0 replies; 4+ messages in thread
From: Segher Boessenkool @ 2019-06-26 17:14 UTC (permalink / raw)
  To: Prathamesh Kulkarni, GCC Development, richard.sandiford

On Wed, Jun 26, 2019 at 05:45:48PM +0100, Richard Sandiford wrote:
> "Obviously" correct seems a stretch :-)  We can only fold:
> 
>   (subreg:V2DF (foo:V2DI X) 0)
> 
> to:
> 
>   (foo:V2DF X)
> 
> for certain operations.
> 
> E.g. it'd be wrong to do it for foo=plus.

You would need to change X then, sure, so you cannot get that by doing a
simple substitution.  But this is lt, and it makes (structurally) perfect
sense here, the mode of lt does not depend on the mode of its args.  The
target should refuse it if it doesn't like it.  Simply by not having too
lenient patterns in the machine descriptions, probably.

> IMO it's wrong for comparisons too.  A comparison between integers
> that produces a floating-point result makes no sense, whatever the
> target thinks about it.

Then the target should not say it makes sense?

> >> Is the above folding correct, since lt has V2DF mode,
> >> and casting -1 (literally) to DFmode would result in -NaN ?
> >
> > Combine does not introduce any of that, it was there already.
> 
> The original insns had an lt:V2DI between V2DI inputs and a V2DF
> subreg of the result.  It's combine that turns that into a lt:V2DF
> between V2DI inputs.

Combine did only simple substitution as far as I can see.


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-06-26 17:14 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-26 13:58 Possibly latent issue with combine ? Prathamesh Kulkarni
2019-06-26 16:18 ` Segher Boessenkool
2019-06-26 16:45   ` Richard Sandiford
2019-06-26 17:14     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).