From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id C03F83858408; Mon,  4 Oct 2021 12:12:24 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C03F83858408
From: "gabravier at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/102591] New: Failure to optimize search for value in
 vector-sized area to use SIMD
Date: Mon, 04 Oct 2021 12:12:24 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: gabravier at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status
 bug_severity priority component assigned_to reporter target_milestone
Message-ID: <bug-102591-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 04 Oct 2021 12:12:24 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102591

            Bug ID: 102591
           Summary: Failure to optimize search for value in vector-sized
                    area to use SIMD
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stdbool.h>

bool match8(char *tpl)
{
    int found = 0;
    for (int at = 0; at < 16; at++)
        if (tpl[at] == 0)
            found = 1;
    return found;
}

This function can be greatly optimized by using SIMD, for example into
something like this:

#include <emmintrin.h>  /* _mm_movemask_epi8 and __m128i (SSE2) */

typedef char v16i8 __attribute__((vector_size(16)));

bool match8v2(char *tpl)
{
    v16i8 values;
    __builtin_memcpy(&values, tpl, 16);
    v16i8 compared = (values == 0);
    return _mm_movemask_epi8((__m128i)compared) != 0;
}

This optimization is done by LLVM, but not by GCC.
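
To spell out why the vector version returns the same value as the loop, here
is a scalar model of its two steps. It is only an illustrative sketch: the
names `compare_eq_zero_model`, `movemask_model` and `match8_model` are
hypothetical and do not come from GCC, LLVM or the intrinsics headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models the lane-wise `values == 0` comparison.
   Each output byte is 0xFF where the input byte is zero, else 0x00. */
static void compare_eq_zero_model(const char *in, uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (in[i] == 0) ? 0xFF : 0x00;
}

/* Hypothetical helper: models _mm_movemask_epi8 by gathering the top bit
   of each of the 16 bytes into bits 0..15 of the result. */
static unsigned movemask_model(const uint8_t lanes[16])
{
    unsigned mask = 0;
    for (int i = 0; i < 16; i++)
        mask |= (unsigned)(lanes[i] >> 7) << i;
    return mask;
}

/* Same result as match8: true iff any of the 16 bytes is zero. */
bool match8_model(const char *tpl)
{
    uint8_t compared[16];
    assert(tpl != 0); /* precondition added for this sketch only */
    compare_eq_zero_model(tpl, compared);
    return movemask_model(compared) != 0;
}
```

Since every matching lane contributes exactly one set bit to the mask, the
mask is nonzero exactly when the loop in match8 would have set `found`.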

PS: I've marked this as an x86 bug, but only because I could not find a
portable way of expressing `_mm_movemask_epi8((__m128i)compared)`. I would
assume other architectures have similar ways of expressing the same thing
cheaply.

(For example, Altivec should be able to implement that operation with a
`vec_extract(vec_vbpermq((__vector unsigned char)compared, perm), 1)` with
`perm` looking like this: `{120, 112, 104, 96, 88, 80, 72, 64, 56, 48, 40, 32,
24, 16, 8, 0}` and the 1 replaced with 14 on big-endian.)
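
For the "is any lane set" test specifically, one target-independent spelling
is to reinterpret the comparison result as two 64-bit words and OR them. This
is a minimal sketch assuming the GNU vector extensions (GCC or Clang). The
name `match8_portable` is hypothetical, and whether a given target lowers it
as cheaply as a movemask has not been checked here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef char v16i8 __attribute__((vector_size(16)));

/* Hypothetical name; not part of the original report. */
bool match8_portable(const char *tpl)
{
    v16i8 values;
    uint64_t halves[2];

    assert(tpl != NULL); /* precondition added for this sketch only */
    memcpy(&values, tpl, 16);

    /* Lane-wise compare: all-ones in lanes where the byte is zero. */
    v16i8 compared = (v16i8)(values == 0);

    /* Reinterpret the 16 mask bytes as two 64-bit words and OR them;
       the result is nonzero iff at least one lane matched. */
    memcpy(halves, &compared, sizeof halves);
    return (halves[0] | halves[1]) != 0;
}
```

Unlike the movemask form, this does not say which lane matched, but the
original function does not need that information either.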