From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=IgfL=BU=gmail.com=richard.guenther@sourceware.org>
Received: from mail-lf1-x129.google.com (mail-lf1-x129.google.com [IPv6:2a00:1450:4864:20::129])
	by sourceware.org (Postfix) with ESMTPS id 94AFA3858D20
	for <gcc-patches@gcc.gnu.org>; Wed, 31 May 2023 07:14:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 94AFA3858D20
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-lf1-x129.google.com with SMTP id 2adb3069b0e04-4f3b9e54338so6406325e87.0
        for <gcc-patches@gcc.gnu.org>; Wed, 31 May 2023 00:14:45 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1685517283; x=1688109283;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:from:to:cc:subject:date
         :message-id:reply-to;
        bh=E9yrJeB9u71CjH7lQon7zQQwBUJ3/H1ClRT3alg/4dQ=;
        b=Qmeek05LoJpEs9UOgckr+3+dXD8qzO/AEf7tmYeROhcWoBHENcgfgbzh3zOL6/Yxpz
         psdofja+L1tD0To5WzBnE/urZt2joK1GkNbQu0By2C+NjGFyCu3o4n/M93urhBi2bdmu
         KHIFjqBH+GxLdDLer3Ve6UphAPmxov66jD+e/CgTWmkRufNwCOVtRMzxRmCUfr3rChgj
         XFxGft3Uv3WzOrW2rKJkhhX1t/N0o7zMm/1XAegf3VrzF/CZdVjwNvNKvGK1QJ19NgrP
         yQUcFzv87FBGq8V3hqC9jnHQab+Q5p0PoD9CZJcSb+V7rSoN7oWlhbm9wEkFXCrYd9gl
         ryrQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1685517283; x=1688109283;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=E9yrJeB9u71CjH7lQon7zQQwBUJ3/H1ClRT3alg/4dQ=;
        b=UVLXpuWvwsYdlavBcv4hQlR3XpRqdR/vrQqLMxOxWNpiSnt4grz1cZEAAd8s6lhDok
         hW7bqoo/WBc8WjJ89BHUU3t0yjj3EfJ+2fFIDHuJ795ljhZw9jy2xTwyCo03z+GuGNwn
         JH79FS2gmSwTWIJ0z6flpDcik+QHLwrVyCIELFWatkXrPef0NGeHgshd/D+Ji1wiblKw
         4qk8/BTOJXMmYdJZrzYkfPBzjNCOYWfhOFwR+H/0+wtrI5NKV396Z9Yy6EGjlIjudvuk
         bLxIVi/da2Lsw8BwMqkGDjwuocGjRx55QQRzsvCUUS4Nequ6EZfG4DZxP5F1PGsfChKC
         wcVg==
X-Gm-Message-State: AC+VfDyWAK12+KVSrB5whyJI04saAfRVlHKtflYxFJx0MsgGc4LT38HD
	Pg/b7GTlo8N0I7K24qem6RlUyaEys+HimHlM7QHFR19Y
X-Google-Smtp-Source: ACHHUZ6HJAkqR/thsRsRR2sCa2b8y/s8GWwxrvpZyA1sxuGZjWBG0WMsmmJILgHebxqJL2YW+78NViPbUf4X8d8fGaY=
X-Received: by 2002:a2e:98c7:0:b0:2af:23fe:98ef with SMTP id
 s7-20020a2e98c7000000b002af23fe98efmr1956801ljj.50.1685517282925; Wed, 31 May
 2023 00:14:42 -0700 (PDT)
MIME-Version: 1.0
References: <CCFDAE94-CF4F-41CB-A001-978B839B0CBF@gmail.com>
 <4b477e8c-f8e6-5587-23ed-540f0c28ea05@ispras.ru> <CAFiYyc39GdcPegXWj4_FJ_XCKtK73ytmGgum_HYnU=V2q3imAw@mail.gmail.com>
 <8f166a4d-ec86-a95d-01d0-c381631da1df@ispras.ru>
In-Reply-To: <8f166a4d-ec86-a95d-01d0-c381631da1df@ispras.ru>
From: Richard Biener <richard.guenther@gmail.com>
Date: Wed, 31 May 2023 09:12:29 +0200
Message-ID: <CAFiYyc1vXNQto94SCS0pSQ_4-rx3oE+tfYo59DjEqLjdE5U_nA@mail.gmail.com>
Subject: Re: [PATCH] doc: clarify semantics of vector bitwise shifts
To: Alexander Monakov <amonakov@ispras.ru>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-0.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Tue, May 30, 2023 at 4:49=E2=80=AFPM Alexander Monakov <amonakov@ispras.=
ru> wrote:
>
>
> On Thu, 25 May 2023, Richard Biener wrote:
>
> > On Wed, May 24, 2023 at 8:36=E2=80=AFPM Alexander Monakov <amonakov@isp=
ras.ru> wrote:
> > >
> > >
> > > On Wed, 24 May 2023, Richard Biener via Gcc-patches wrote:
> > >
> > > > I=E2=80=99d have to check the ISAs what they actually do here - it =
of course depends
> > > > on RTL semantics as well but as you say those are not strictly defi=
ned here
> > > > either.
> > >
> > > Plus, we can add the following executable test to the testsuite:
> >
> > Yeah, that's probably a good idea.  I think your documentation change
> > with the added sentence about the truncation is OK.
>
> I am no longer confident in my patch, sorry.
>
> My claim about vector shift semantics in OpenCL was wrong. In fact it spe=
cifies
> that RHS of a vector shift is masked to the exact bitwidth of the element=
 type.
>
> So, to collect various angles:
>
> 1. OpenCL semantics would need an 'AND' before a shift (except VSX/Altive=
c).
>
> 2. From user side we had a request to follow C integer promotion semantic=
s
>    in https://gcc.gnu.org/PR91838 but I now doubt we can do that.
>
> 3. LLVM makes oversized vector shifts UB both for 'vector_size' and
>    'ext_vector_type'.

I had the impression GCC desired to do 3. as well, matching what we do
for scalar shifts.

> 4. Vector lowering does not emit promotions, and starting from gcc-12
>    ranger treats oversized shifts according to the documentation you
>    cite below, and optimizes (e.g. with '-O2 -mno-sse')
>
>         typedef short v8hi __attribute__((vector_size(16)));
>
>         void f(v8hi *p)
>         {
>             *p >>=3D 16;
>         }
>
>    to zeroing '*p'. If this looks unintended, I can file a bug.
>
> I still think we need to clarify semantics of vector shifts, but probably
> not in the way I proposed initially. What do you think?

I think the intent at some point was to adhere to the OpenCL spec
for the GCC vector extension (because that's a written spec while
GCCs vector extension docs are lacking).  Originally the powerpc
altivec 'vector' keyword spurred most of the development IIRC
so it might be useful to see how they specify shifts.

So yes, we probably should clarify the semantics to match the
implementation (since we have two targets doing things differently
since forever we can only document it as UB) and also note the
difference from OpenCL (in case OpenCL is still relevant these
days we might want to offer a -fopencl-vectors to emit the required
AND).

It would be also good to amend the RTL documentation.

It would be very nice to start an internals documentation section
around collecting what the middle-end considers undefined
or implementation defined (aka target defined) behavior in the
GENERIC, GIMPLE and RTL ILs and what predicates eventually
control that (like TYPE_OVERFLOW_UNDEFINED).  Maybe spread it over
{gimple,generic,rtl}.texi, though gimple.texi is only about the representat=
ion
and all semantics are shared and documented in generic.texi.

Thanks,
Richard.

> Thanks.
> Alexander
>
> > Note we have
> >
> > /* Shift operations for shift and rotate.
> >    Shift means logical shift if done on an
> >    unsigned type, arithmetic shift if done on a signed type.
> >    The second operand is the number of bits to
> >    shift by; it need not be the same type as the first operand and resu=
lt.
> >    Note that the result is undefined if the second operand is larger
> >    than or equal to the first operand's type size.
> >
> >    The first operand of a shift can have either an integer or a
> >    (non-integer) fixed-point type.  We follow the ISO/IEC TR 18037:2004
> >    semantics for the latter.
> >
> >    Rotates are defined for integer types only.  */
> > DEFTREECODE (LSHIFT_EXPR, "lshift_expr", tcc_binary, 2)
> >
> > in tree.def which implies short << 24 is undefined behavior (similar
> > wording in generic.texi).  The rtl docs say nothing about behavior
> > but I think the semantics should carry over.  That works for x86
> > even for scalar instructions working on GPRs (masking is applied
> > but fixed to 5 or 6 bits even for QImode or HImode shifts).
> >
> > Note that when we make these shifts well-defined there's
> > also arithmetic on signed types smaller than int (which again
> > doesn't exist in C) where overflow invokes undefined behavior
> > in the middle-end.  Unless we want to change that as well
> > this is somewhat inconsistent then.
> >
> > There's also the issue that C 'int' is defined by INT_TYPE_SIZE
> > and thus target dependent which makes what is undefined and
> > what not target dependent.
> >
> > Richard.
> >
> > > #include <stdint.h>
> > >
> > > #define CHECK(TYPE, WIDTH, OP, COUNT, INVERT)         \
> > > {                                                     \
> > > typedef TYPE vec __attribute__((vector_size(WIDTH))); \
> > >                                                       \
> > >         static volatile vec zero;                     \
> > >         vec tmp =3D (zero-2) OP (COUNT);                \
> > >         vec ref =3D INVERT zero;                        \
> > >         if (__builtin_memcmp(&tmp, &ref, sizeof tmp)) \
> > >                 __builtin_abort();                    \
> > > }
> > >
> > > int main(void)
> > > {
> > >         CHECK( uint8_t, 16, <<, 8,  )
> > >         CHECK( uint8_t, 16, <<, 31, )
> > >         CHECK( uint8_t, 16, >>, 8,  )
> > >         CHECK( uint8_t, 16, >>, 31, )
> > >         CHECK(  int8_t, 16, <<, 8,  )
> > >         CHECK(  int8_t, 16, <<, 31, )
> > >         CHECK(  int8_t, 16, >>, 8,  ~)
> > >         CHECK(  int8_t, 16, >>, 31, ~)
> > >         CHECK(uint16_t, 16, <<, 16, )
> > >         CHECK(uint16_t, 16, <<, 31, )
> > >         CHECK(uint16_t, 16, >>, 16, )
> > >         CHECK(uint16_t, 16, >>, 31, )
> > >         CHECK( int16_t, 16, <<, 16, )
> > >         CHECK( int16_t, 16, <<, 31, )
> > >         CHECK( int16_t, 16, >>, 16, ~)
> > >         CHECK( int16_t, 16, >>, 31, ~)
> > >         // Per-lane-variable shifts:
> > >         CHECK( uint8_t, 16, <<, zero+8,  )
> > >         CHECK( uint8_t, 16, <<, zero+31, )
> > >         CHECK( uint8_t, 16, >>, zero+8,  )
> > >         CHECK( uint8_t, 16, >>, zero+31, )
> > >         CHECK(  int8_t, 16, <<, zero+8,  )
> > >         CHECK(  int8_t, 16, <<, zero+31, )
> > >         CHECK(  int8_t, 16, >>, zero+8,  ~)
> > >         CHECK(  int8_t, 16, >>, zero+31, ~)
> > >         CHECK(uint16_t, 16, <<, zero+16, )
> > >         CHECK(uint16_t, 16, <<, zero+31, )
> > >         CHECK(uint16_t, 16, >>, zero+16, )
> > >         CHECK(uint16_t, 16, >>, zero+31, )
> > >         CHECK( int16_t, 16, <<, zero+16, )
> > >         CHECK( int16_t, 16, <<, zero+31, )
> > >         CHECK( int16_t, 16, >>, zero+16, ~)
> > >         CHECK( int16_t, 16, >>, zero+31, ~)
> > >
> > >         // Repeat for WIDTH=3D32 and WIDTH=3D64
> > > }
> > >
> > > Alexander
> >