From: Uros Bizjak <ubizjak@gmail.com>
To: Roger Sayle <roger@nextmovesoftware.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [x86 PATCH] PR target/70321: Split double word equality/inequality after STV.
Date: Wed, 13 Apr 2022 20:51:04 +0200 [thread overview]
Message-ID: <CAFULd4a+hSD5TyjwgaH+0f2fupG+0f5zMWoO2z17hZeDDXvcKA@mail.gmail.com> (raw)
In-Reply-To: <001f01d84f15$52729c60$f757d520$@nextmovesoftware.com>
On Wed, Apr 13, 2022 at 11:03 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch resolves the last piece of PR target/70321 a code quality
> (P2 regression) affecting mainline. Currently, for HJ's testcase:
>
> void foo (long long ixi)
> {
> if (ixi != 14348907)
> __builtin_abort ();
> }
>
> GCC with -m32 -O2 generates four instructions for the comparison:
>
> movl 16(%esp), %eax
> movl 20(%esp), %edx
> xorl $14348907, %eax
> orl %eax, %edx
>
> but with this patch it now requires only three, making better use of
> x86's addressing modes:
>
> movl 16(%esp), %eax
> xorl $14348907, %eax
> orl 20(%esp), %eax
>
> The solution is to expand "doubleword" equality/inequality expressions
> using flag setting COMPARE instructions for the early RTL passes, and
> then split them during split1, after STV and before reload.
> Hence on x86_64, we now see/allow things like:
>
> (insn 11 8 12 2 (set (reg:CCZ 17 flags)
> (compare:CCZ (reg/v:TI 84 [ x ])
> (reg:TI 96))) "cmpti.c":2:43 30 {*cmpti_doubleword}
>
> This allows the STV pass to decide whether it's preferrable to perform
> this comparison using vector operations, i.e. a pxor/ptest sequence,
> or as scalar integer operations, i.e. a xor/xor/or sequence. Alas
> this required tweaking of the STV pass to recognize the "new" form of
> these comparisons and split out the pxor operation itself. To confirm
> this still works as expected I've added a new STV test case:
>
> long long a[1024];
> long long b[1024];
>
> int foo()
> {
> for (int i=0; i<1024; i++)
> {
> long long t = (a[i]<<8) | (b[i]<<24);
> if (t == 0)
> return 1;
> }
> return 0;
> }
>
> where with -m32 -O2 -msse4.1 the above comparison with zero should look
> like:
>
> punpcklqdq %xmm0, %xmm0
> ptest %xmm0, %xmm0
>
> Although this patch includes one or two minor tweaks to provide all the
> necessary infrastructure to support conversion of TImode comparisons to
> V1TImode (and SImode comparisons to V4SImode), STV doesn't yet implement
> these transformations, but this is something that can be considered after
> stage 4. Indeed the new convert_compare functionality is split out
> into a method to simplify its potential reuse by the timode_scalar_chain
> class.
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures for -m64, and two (missed-optimization) FAILs on
> -m32 (both around pandn). The first gcc.target/i386/pr65105-5.c is
> resolved by the patch I posted yesterday, the other i386/pr78794.c is
> easily fixed by tweaking STV's "gain" estimates, but it's preferrable to
> generate better code for both the scalar and vector implementations
> before tweaking these parameters (patches to follow shortly).
>
> Ok for mainline?
Unless there is a really compelling reason that warrants committing
this patch in stage 4, I'd propose to postpone it to next stage 1. The
patch is quite involved and touches a somewhat complex part of the
compiler (STV). Also, looking at the above description, the patch
introduces new functionality that represents only a part of the
complete solution, and as such opens a door for possible regressions,
which are not welcomed in stage 4 at all.
Uros.
>
> 2022-04-13 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> PR target/70321
> * config/i386/i386-expand.cc (ix86_expand_branch): Don't decompose
> DI mode equality/inequality using XOR here. Instead generate a
> COMPARE for doubleword modes (DImode on !TARGET_64BIT or TImode).
> * config/i386/i386-features.cc (gen_gpr_to_xmm_move_src): Use
> gen_rtx_SUBREG when NUNITS is 1, i.e. for TImode to V1TImode.
> (general_scalar_chain::convert_compare): New function to convert
> scalar equality/inequality comparison into vector operations.
> (general_scalar_chain::convert_insn) [COMPARE]: Refactor. Call
> new convert_compare helper method.
> (convertible_comparion_p): Update to match doubleword COMPARE
> of two register, memory or integer constant operands.
> * config/i386/i386-features.h
> (general_scalar_chain::convert_compare):
> Prototype/declare member function here.
> * config/i386/i386.md (cstore<mode>4): Change mode to SDWIM, but
> only allow new doubleword modes for EQ and NE operators.
> (*cmp<dwi>_doubleword): New define_insn_and_split, to split a
> doubleword comparison into a pair of XORs followed by an IOR to
> set the (zero) flags register, optimizing the XORs if possible.
> * config/i386/sse.md (V_AVX): Include V1TI and V2TI in mode
> iterator;
> V_AVX is (currently) only used by ptest.
> (sse4_1 mode attribute): Update to support V1TI and V2TI.
>
> gcc/testsuite/ChangeLog
> PR target/70321
> * gcc.target/i386/pr70321.c: New test case.
> * gcc.target/i386/sse4_1-stv-1.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>
prev parent reply other threads:[~2022-04-13 18:51 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-13 9:03 Roger Sayle
2022-04-13 18:51 ` Uros Bizjak [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFULd4a+hSD5TyjwgaH+0f2fupG+0f5zMWoO2z17hZeDDXvcKA@mail.gmail.com \
--to=ubizjak@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=roger@nextmovesoftware.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).