public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* VN, len_store and endianness
@ 2022-09-26 14:21 Robin Dapp
  2022-09-27  8:39 ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Dapp @ 2022-09-26 14:21 UTC (permalink / raw)
  To: GCC Patches, Richard Biener, jakub

Hi,

I'm locally testing a branch that enables vll/vstl for partial vector
usage i.e. len_load and len_store on s390.  I see a FAIL in
testsuite/gfortran.dg/power_3.f90.
Since r13-1777-gbd9837bc3ca134 we also performe VN for masked/len stores
and things go wrong there.  The problem seems to be that we evaluate a
vector constant {-1, 1, -1, 1} loaded with length 11 + 1(bias) = 12 as
{1, -1, 1} instead of {-1, 1, -1}.

I found it a bit difficult to navigate through the logic due to several
sizes, offsets, lengths and "amounts" :)  From what I can tell the
culprit code is (guarded by BYTES_BIG_ENDIAN)

   if (TREE_CODE (pd.rhs) != CONSTRUCTOR)
     {
         q = (this_buffer + len
              - (ROUND_UP (size - amnt, BITS_PER_UNIT)
                 / BITS_PER_UNIT));
     }

where, with pd.rhs = { 255, 255, 255, 255, 0, 0, 0, 1, 255, 255, 255,
255, 0, 0, 0, 1 }, len = 16 bytes, size = 96 bits, we read after the
first 32 bits.  What is supposed to happen here?  It looks like going
backwards (when size grows), but actually size shrinks for my example
with each successive element via pd.offset 0, -32 and -64.

When skipping the block with && TREE_CODE (pd.rhs) != VECTOR_CST the
test and various others succeed but I didn't pursue testing further and
figured I'd rather ask here for more insight.

Regards
 Robin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: VN, len_store and endianness
  2022-09-26 14:21 VN, len_store and endianness Robin Dapp
@ 2022-09-27  8:39 ` Richard Biener
  2022-09-27 13:19   ` Robin Dapp
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2022-09-27  8:39 UTC (permalink / raw)
  To: Robin Dapp; +Cc: GCC Patches, jakub

On Mon, Sep 26, 2022 at 4:21 PM Robin Dapp <rdapp@linux.ibm.com> wrote:
>
> Hi,
>
> I'm locally testing a branch that enables vll/vstl for partial vector
> usage i.e. len_load and len_store on s390.  I see a FAIL in
> testsuite/gfortran.dg/power_3.f90.
> Since r13-1777-gbd9837bc3ca134 we also performe VN for masked/len stores
> and things go wrong there.  The problem seems to be that we evaluate a
> vector constant {-1, 1, -1, 1} loaded with length 11 + 1(bias) = 12 as
> {1, -1, 1} instead of {-1, 1, -1}.
>
> I found it a bit difficult to navigate through the logic due to several
> sizes, offsets, lengths and "amounts" :)  From what I can tell the
> culprit code is (guarded by BYTES_BIG_ENDIAN)
>
>    if (TREE_CODE (pd.rhs) != CONSTRUCTOR)
>      {
>          q = (this_buffer + len
>               - (ROUND_UP (size - amnt, BITS_PER_UNIT)
>                  / BITS_PER_UNIT));
>      }
>
> where, with pd.rhs = { 255, 255, 255, 255, 0, 0, 0, 1, 255, 255, 255,
> 255, 0, 0, 0, 1 }, len = 16 bytes, size = 96 bits, we read after the
> first 32 bits.  What is supposed to happen here?  It looks like going
> backwards (when size grows), but actually size shrinks for my example
> with each successive element via pd.offset 0, -32 and -64.
>
> When skipping the block with && TREE_CODE (pd.rhs) != VECTOR_CST the
> test and various others succeed but I didn't pursue testing further and
> figured I'd rather ask here for more insight.

The error is probably in vn_reference_lookup_3 which assumes that
'len' applies to the vector elements in element order.  See the part
of the code where it checks for internal_store_fn_p.  If 'len' is with
respect to the memory and thus endianess has to be taken into
account then for the IFN_LEN_STORE

              else if (fn == IFN_LEN_STORE)
                {
                  pd.rhs_off = 0;
                  pd.offset = offset2i;
                  pd.size = (tree_to_uhwi (len)
                             + -tree_to_shwi (bias)) * BITS_PER_UNIT;
                  if (ranges_known_overlap_p (offset, maxsize,
                                              pd.offset, pd.size))
                    return data->push_partial_def (pd, set, set,
                                                   offseti, maxsizei);

likely needs to adjust rhs_off from zero for big endian?

>
> Regards
>  Robin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: VN, len_store and endianness
  2022-09-27  8:39 ` Richard Biener
@ 2022-09-27 13:19   ` Robin Dapp
  2022-09-27 13:49     ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Dapp @ 2022-09-27 13:19 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, jakub

> The error is probably in vn_reference_lookup_3 which assumes that
> 'len' applies to the vector elements in element order.  See the part
> of the code where it checks for internal_store_fn_p.  If 'len' is with
> respect to the memory and thus endianess has to be taken into
> account then for the IFN_LEN_STORE
> 
>               else if (fn == IFN_LEN_STORE)
>                 {
>                   pd.rhs_off = 0;
>                   pd.offset = offset2i;
>                   pd.size = (tree_to_uhwi (len)
>                              + -tree_to_shwi (bias)) * BITS_PER_UNIT;
>                   if (ranges_known_overlap_p (offset, maxsize,
>                                               pd.offset, pd.size))
>                     return data->push_partial_def (pd, set, set,
>                                                    offseti, maxsizei);
> 
> likely needs to adjust rhs_off from zero for big endian?

Not sure I follow entirely.  rhs_off only seems to be used for
native_encode_expr which properly encodes already ({-1, 1, -1, 1} in
that order in memory).  A 'len' of 12 is the first three elements (in
the same order or element order as well).

If the constant were encoded in little endian ({1, -1, 1, -1}) 'q' would
kind of address the right elements (using always the second, or
"reversed third" element while shifting the buffer by 4 bytes each time).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: VN, len_store and endianness
  2022-09-27 13:19   ` Robin Dapp
@ 2022-09-27 13:49     ` Richard Biener
  2022-09-27 13:59       ` Robin Dapp
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2022-09-27 13:49 UTC (permalink / raw)
  To: Robin Dapp; +Cc: GCC Patches, jakub

On Tue, Sep 27, 2022 at 3:19 PM Robin Dapp <rdapp@linux.ibm.com> wrote:
>
> > The error is probably in vn_reference_lookup_3 which assumes that
> > 'len' applies to the vector elements in element order.  See the part
> > of the code where it checks for internal_store_fn_p.  If 'len' is with
> > respect to the memory and thus endianess has to be taken into
> > account then for the IFN_LEN_STORE
> >
> >               else if (fn == IFN_LEN_STORE)
> >                 {
> >                   pd.rhs_off = 0;
> >                   pd.offset = offset2i;
> >                   pd.size = (tree_to_uhwi (len)
> >                              + -tree_to_shwi (bias)) * BITS_PER_UNIT;
> >                   if (ranges_known_overlap_p (offset, maxsize,
> >                                               pd.offset, pd.size))
> >                     return data->push_partial_def (pd, set, set,
> >                                                    offseti, maxsizei);
> >
> > likely needs to adjust rhs_off from zero for big endian?
>
> Not sure I follow entirely.  rhs_off only seems to be used for
> native_encode_expr which properly encodes already ({-1, 1, -1, 1} in
> that order in memory).  A 'len' of 12 is the first three elements (in
> the same order or element order as well).

Yes, because the native_interpret always starts at offset zero
(we can't easily feed in a "shifted" RHS).  So what I assumed is
that IFN_LEN_STORE always stores elements [0, len + adj].

> If the constant were encoded in little endian ({1, -1, 1, -1}) 'q' would
> kind of address the right elements (using always the second, or
> "reversed third" element while shifting the buffer by 4 bytes each time).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: VN, len_store and endianness
  2022-09-27 13:49     ` Richard Biener
@ 2022-09-27 13:59       ` Robin Dapp
  2022-09-29  7:32         ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Dapp @ 2022-09-27 13:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, jakub

> Yes, because the native_interpret always starts at offset zero
> (we can't easily feed in a "shifted" RHS).  So what I assumed is
> that IFN_LEN_STORE always stores elements [0, len + adj].

Hmm, but this assumption is not violated here or am I missing something?
 It's not like we're storing [vec_size - (len + adj) - 1, vec_size - 1]
but indeed [0, len + adj].  Just the access to the buffer later is
reversed which it should not be.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: VN, len_store and endianness
  2022-09-27 13:59       ` Robin Dapp
@ 2022-09-29  7:32         ` Richard Biener
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2022-09-29  7:32 UTC (permalink / raw)
  To: Robin Dapp; +Cc: GCC Patches, jakub

On Tue, Sep 27, 2022 at 3:59 PM Robin Dapp <rdapp@linux.ibm.com> wrote:
>
> > Yes, because the native_interpret always starts at offset zero
> > (we can't easily feed in a "shifted" RHS).  So what I assumed is
> > that IFN_LEN_STORE always stores elements [0, len + adj].
>
> Hmm, but this assumption is not violated here or am I missing something?
>  It's not like we're storing [vec_size - (len + adj) - 1, vec_size - 1]
> but indeed [0, len + adj].  Just the access to the buffer later is
> reversed which it should not be.

But rhs_off is applied to the buffer "load" IIRC.  At least somewhere
there's an inconsistency and since rhs_off is new I bet that's where
it is ...

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-09-29  7:33 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-26 14:21 VN, len_store and endianness Robin Dapp
2022-09-27  8:39 ` Richard Biener
2022-09-27 13:19   ` Robin Dapp
2022-09-27 13:49     ` Richard Biener
2022-09-27 13:59       ` Robin Dapp
2022-09-29  7:32         ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).