public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/106060] Inefficient constant broadcast on x86_64
Date: Tue, 07 May 2024 06:19:15 +0000
Message-ID: <bug-106060-4-geeYGNyOF6@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-106060-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106060

--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:79649a5dcd81bc05c0ba591068c9075de43bd417

commit r15-222-g79649a5dcd81bc05c0ba591068c9075de43bd417
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Tue May 7 07:14:40 2024 +0100

    PR target/106060: Improved SSE vector constant materialization on x86.

    This patch resolves PR target/106060 by providing efficient methods for
    materializing/synthesizing special "vector" constants on x86.  Currently
    there are three methods of materializing a vector constant: the most
    general is to load a vector from the constant pool; secondly, "duplicated"
    constants can be synthesized by moving an integer between units and
    broadcasting (or shuffling) it; and finally, the special cases of the
    all-zeros and all-ones vectors can be loaded via a single SSE
    instruction.  This patch handles additional cases that can be synthesized
    in two instructions: loading an all-ones vector followed by another SSE
    instruction.  Following my recent patch for PR target/112992, there's
    conveniently a single place in i386-expand.cc where these special cases
    can be handled.

    Two examples are given in the original bugzilla PR for 106060.

    __m256i should_be_cmpeq_abs ()
    {
      return _mm256_set1_epi8 (1);
    }

    is now generated (with -O3 -march=x86-64-v3) as:

            vpcmpeqd        %ymm0, %ymm0, %ymm0
            vpabsb  %ymm0, %ymm0
            ret

    and

    __m256i should_be_cmpeq_add ()
    {
      return _mm256_set1_epi8 (-2);
    }

    is now generated as:

            vpcmpeqd        %ymm0, %ymm0, %ymm0
            vpaddb  %ymm0, %ymm0, %ymm0
            ret
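
To see why both two-instruction sequences above produce the requested constants, here is a minimal portable C model of the byte-lane arithmetic. It is only a sketch: the `model_*` helper names are hypothetical, it emulates the instructions with plain loops rather than using intrinsics, and it is not code from the patch. An all-ones byte is -1 when read as signed, so its absolute value is 1, and adding it to itself wraps to 0xFE, which is -2.

```c
#include <stdint.h>
#include <string.h>

enum { LANES = 32 }; /* a 256-bit register viewed as 32 byte lanes */

/* Models vpcmpeqd %ymm0, %ymm0, %ymm0: every lane equals itself,
   so every bit of the result is set.  */
static void model_all_ones(uint8_t v[LANES])
{
  memset(v, 0xFF, LANES);
}

/* Models vpabsb: absolute value of each byte read as a signed lane.  */
static void model_pabsb(uint8_t dst[LANES], const uint8_t src[LANES])
{
  for (int i = 0; i < LANES; i++)
    {
      /* Reinterpret the byte as a signed value without relying on
         implementation-defined conversions.  */
      int s = src[i] >= 0x80 ? (int) src[i] - 0x100 : (int) src[i];
      dst[i] = (uint8_t) (s < 0 ? -s : s);
    }
}

/* Models vpaddb: lane-wise addition, wrapping modulo 256.  */
static void model_paddb(uint8_t dst[LANES], const uint8_t a[LANES],
                        const uint8_t b[LANES])
{
  for (int i = 0; i < LANES; i++)
    dst[i] = (uint8_t) (a[i] + b[i]);
}

/* Returns 1 if every lane of V holds the byte EXPECTED, else 0.  */
static int model_is_broadcast(const uint8_t v[LANES], uint8_t expected)
{
  for (int i = 0; i < LANES; i++)
    if (v[i] != expected)
      return 0;
  return 1;
}
```

Calling `model_all_ones` and then `model_pabsb` leaves 0x01 in every lane, matching `_mm256_set1_epi8 (1)`; calling `model_paddb` on two all-ones inputs leaves 0xFE in every lane, matching `_mm256_set1_epi8 (-2)`.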

    2024-05-07  Roger Sayle  <roger@nextmovesoftware.com>
                Hongtao Liu  <hongtao.liu@intel.com>

    gcc/ChangeLog
            PR target/106060
            * config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New.
            (struct ix86_vec_bcast_map_simode_t): New type for table below.
            (ix86_vec_bcast_map_simode): Table of SImode constants that may
            be efficiently synthesized by an ix86_vec_bcast_alg method.
            (ix86_vec_bcast_map_simode_cmp): New comparator for bsearch.
            (ix86_vector_duplicate_simode_const): Efficiently synthesize
            V4SImode and V8SImode constants that duplicate special constants.
            (ix86_vector_duplicate_value): Attempt to synthesize "special"
            vector constants using ix86_vector_duplicate_simode_const.
            * config/i386/i386.cc (ix86_rtx_costs) <case ABS>: ABS of a
            vector integer mode costs with a single SSE instruction.

    gcc/testsuite/ChangeLog
            PR target/106060
            * gcc.target/i386/auto-init-8.c: Update test case.
            * gcc.target/i386/avx512fp16-13.c: Likewise.
            * gcc.target/i386/pr100865-9a.c: Likewise.
            * gcc.target/i386/pr101796-1.c: Likewise.
            * gcc.target/i386/pr106060-1.c: New test case.
            * gcc.target/i386/pr106060-2.c: Likewise.
            * gcc.target/i386/pr106060-3.c: Likewise.
            * gcc.target/i386/pr70314.c: Update test case.
            * gcc.target/i386/vect-shiftv4qi.c: Likewise.
            * gcc.target/i386/vect-shiftv8qi.c: Likewise.
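
The ChangeLog above describes a table of SImode constants searched with a bsearch comparator. The following C fragment is a rough sketch of that general technique only: every identifier in it (`bcast_alg`, `bcast_entry`, `bcast_table`, `bcast_lookup`) and the choice of table entries are assumptions made for illustration, not the contents of i386-expand.cc. Each key is the 32-bit pattern obtained by applying one lane-wise operation to an all-ones vector.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical set of "second instruction" choices.  */
enum bcast_alg { ALG_PABSB, ALG_PABSW, ALG_PABSD,
                 ALG_PADDB, ALG_PADDW, ALG_PADDD };

struct bcast_entry
{
  uint32_t key;        /* 32-bit pattern repeated across the vector */
  enum bcast_alg alg;  /* operation applied to the all-ones vector */
};

/* Must stay sorted by KEY (unsigned order) for bsearch to work.  */
static const struct bcast_entry bcast_table[] = {
  { 0x00000001u, ALG_PABSD },  /* abs of -1 per 32-bit lane */
  { 0x00010001u, ALG_PABSW },  /* abs of -1 per 16-bit lane */
  { 0x01010101u, ALG_PABSB },  /* abs of -1 per 8-bit lane */
  { 0xFEFEFEFEu, ALG_PADDB },  /* -1 + -1 per 8-bit lane */
  { 0xFFFEFFFEu, ALG_PADDW },  /* -1 + -1 per 16-bit lane */
  { 0xFFFFFFFEu, ALG_PADDD },  /* -1 + -1 per 32-bit lane */
};

/* Comparator for bsearch: the first argument is the key being sought,
   the second is a table element.  */
static int bcast_cmp(const void *pkey, const void *pelem)
{
  uint32_t k = *(const uint32_t *) pkey;
  uint32_t e = ((const struct bcast_entry *) pelem)->key;
  return (k > e) - (k < e);
}

/* Returns the matching entry, or NULL if VALUE is not in the table.  */
static const struct bcast_entry *bcast_lookup(uint32_t value)
{
  return bsearch(&value, bcast_table,
                 sizeof bcast_table / sizeof bcast_table[0],
                 sizeof bcast_table[0], bcast_cmp);
}
```

A caller would fall back to the existing broadcast or constant-pool paths when `bcast_lookup` returns NULL. A sorted table plus binary search keeps the special cases in one place and makes adding a new pattern a one-line change.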
