public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Possibly latent issue with combine ?
@ 2019-06-26 13:58 Prathamesh Kulkarni
  2019-06-26 16:18 ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Prathamesh Kulkarni @ 2019-06-26 13:58 UTC (permalink / raw)
  To: GCC Development, Richard Sandiford

Hi,
For following test-case, taken from pr88152.C:

#include <x86intrin.h>

template <typename T, size_t N>
using V [[gnu::vector_size(N)]] = T;

int f10 (V<unsigned long long, 16> a)
{
  return _mm_movemask_pd (reinterpret_cast<__m128d> (a > __LONG_LONG_MAX__));
}

.optimized dump shows:

f10 (V a)
{
  vector(2) signed long _1;
  vector(2) long int _2;
  vector(2) double _3;
  int _6;

  <bb 2> [local count: 1073741824]:
  _1 = VIEW_CONVERT_EXPR<vector(2) signed long>(a_4(D));
  _2 = VEC_COND_EXPR <_1 < { 0, 0 }, { -1, -1 }, { 0, 0 }>;
  _3 = VIEW_CONVERT_EXPR<__m128d>(_2);
  _6 = __builtin_ia32_movmskpd (_3); [tail call]
  return _6;

}

IIUC, we're using -1 to represent true and 0 as false.
combine then does following combinations:

Trying 7 -> 9:
    7: r90:V2DI=r89:V2DI>r93:V2DI
      REG_DEAD r93:V2DI
      REG_DEAD r89:V2DI
    9: r91:V2DF=r90:V2DI#0
      REG_DEAD r90:V2DI
Successfully matched this instruction:
(set (subreg:V2DI (reg:V2DF 91) 0)
    (gt:V2DI (reg:V2DI 89)
        (reg:V2DI 93)))
allowing combination of insns 7 and 9

Trying 6, 9 -> 10:
    6: r89:V2DI=const_vector
    9: r91:V2DF#0=r89:V2DI>r93:V2DI
      REG_DEAD r89:V2DI
      REG_DEAD r93:V2DI
   10: r87:SI=unspec[r91:V2DF] 43
      REG_DEAD r91:V2DF
Successfully matched this instruction:
(set (reg:SI 87)
    (unspec:SI [
            (lt:V2DF (reg:V2DI 93)
                (const_vector:V2DI [
                        (const_int 0 [0]) repeated x2
                    ]))
        ] UNSPEC_MOVMSK))

Is the above folding correct, since lt has V2DF mode,
and casting -1 (literally) to DFmode would result in -NaN ?
Also, should result of lt be having only integral modes ?

split2 then folds insn 10 into:

(insn 22 9 16 2 (set (reg:SI 0 ax [87])
        (unspec:SI [
                (reg:V2DF 20 xmm0 [93])
            ] UNSPEC_MOVMSK))
"../../stage1-build/gcc/include/emmintrin.h":958:34 4222
{sse2_movmskpd}
     (nil))

deleting insn 10.

The issue is my patch for PR88833 results in following propagation in forwprop1:

In insn 10, replacing
 (unspec:SI [
            (reg:V2DF 91)
        ] UNSPEC_MOVMSK)
 with (unspec:SI [
            (subreg:V2DF (reg:V2DI 90) 0)
        ] UNSPEC_MOVMSK)

deleting insn 9 and this inhibits the above combinations,
resulting in failure of PR88152.C

With patch, combine shows:
Trying 7 -> 10:
    7: r90:V2DI=r89:V2DI>r93:V2DI
      REG_DEAD r93:V2DI
      REG_DEAD r89:V2DI
   10: r87:SI=unspec[r90:V2DI#0] 43
      REG_DEAD r90:V2DI
Failed to match this instruction:
(set (reg:SI 87)
    (unspec:SI [
            (subreg:V2DF (gt:V2DI (reg:V2DI 89)
                    (reg:V2DI 93)) 0)
        ] UNSPEC_MOVMSK))

and subsequently fails to match 6, 7 -> 10

Patch:
http://people.linaro.org/~prathamesh.kulkarni/pr88833-10.diff

Upstream discussion about the issue:
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01651.html

Thanks,
Prathamesh

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-06-26 17:14 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-26 13:58 Possibly latent issue with combine ? Prathamesh Kulkarni
2019-06-26 16:18 ` Segher Boessenkool
2019-06-26 16:45   ` Richard Sandiford
2019-06-26 17:14     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).