From: Uros Bizjak <ubizjak@gmail.com>
To: Roger Sayle <roger@nextmovesoftware.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [x86 PATCH] Add cbranchti4 pattern to i386.md (for -m32 compare_by_pieces).
Date: Tue, 27 Jun 2023 22:20:12 +0200
Message-ID: <CAFULd4YW6bQGS6AfcHTapFRjbCr6qN5CxQ8-nvBZ_naTYezZzA@mail.gmail.com>
In-Reply-To: <013101d9a91b$eb84cb60$c28e6220$@nextmovesoftware.com>
On Tue, Jun 27, 2023 at 7:22 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch fixes some very odd (unanticipated) code generation by
> compare_by_pieces with -m32 -mavx, since the recent addition of the
> cbranchoi4 pattern. The issue is that cbranchoi4 is available with
> TARGET_AVX, but cbranchti4 is currently conditional on TARGET_64BIT
> which results in the odd behaviour (thanks to OPTAB_WIDEN) that with
> -m32 -mavx, compare_by_pieces ends up (inefficiently) widening 128-bit
> comparisons to 256-bits before performing PTEST.
>
> This patch fixes this by providing a cbranchti4 pattern that's available
> with either TARGET_64BIT or TARGET_SSE4_1.
>
> For the test case below (again from PR 104610):
>
> int foo(char *a)
> {
>   static const char t[] = "0123456789012345678901234567890";
>   return __builtin_memcmp(a, &t[0], sizeof(t)) == 0;
> }
>
> GCC with -m32 -O2 -mavx currently produces the bonkers:
>
> foo:    pushl   %ebp
>         movl    %esp, %ebp
>         andl    $-32, %esp
>         subl    $64, %esp
>         movl    8(%ebp), %eax
>         vmovdqa .LC0, %xmm4
>         movl    $0, 48(%esp)
>         vmovdqu (%eax), %xmm2
>         movl    $0, 52(%esp)
>         movl    $0, 56(%esp)
>         movl    $0, 60(%esp)
>         movl    $0, 16(%esp)
>         movl    $0, 20(%esp)
>         movl    $0, 24(%esp)
>         movl    $0, 28(%esp)
>         vmovdqa %xmm2, 32(%esp)
>         vmovdqa %xmm4, (%esp)
>         vmovdqa (%esp), %ymm5
>         vpxor   32(%esp), %ymm5, %ymm0
>         vptest  %ymm0, %ymm0
>         jne     .L2
>         vmovdqu 16(%eax), %xmm7
>         movl    $0, 48(%esp)
>         movl    $0, 52(%esp)
>         vmovdqa %xmm7, 32(%esp)
>         vmovdqa .LC1, %xmm7
>         movl    $0, 56(%esp)
>         movl    $0, 60(%esp)
>         movl    $0, 16(%esp)
>         movl    $0, 20(%esp)
>         movl    $0, 24(%esp)
>         movl    $0, 28(%esp)
>         vmovdqa %xmm7, (%esp)
>         vmovdqa (%esp), %ymm1
>         vpxor   32(%esp), %ymm1, %ymm0
>         vptest  %ymm0, %ymm0
>         je      .L6
> .L2:    movl    $1, %eax
>         xorl    $1, %eax
>         vzeroupper
>         leave
>         ret
> .L6:    xorl    %eax, %eax
>         xorl    $1, %eax
>         vzeroupper
>         leave
>         ret
>
> with this patch, we now generate the (slightly) more sensible:
>
> foo:    vmovdqa .LC0, %xmm0
>         movl    4(%esp), %eax
>         vpxor   (%eax), %xmm0, %xmm0
>         vptest  %xmm0, %xmm0
>         jne     .L2
>         vmovdqa .LC1, %xmm0
>         vpxor   16(%eax), %xmm0, %xmm0
>         vptest  %xmm0, %xmm0
>         je      .L5
> .L2:    movl    $1, %eax
>         xorl    $1, %eax
>         ret
> .L5:    xorl    %eax, %eax
>         xorl    $1, %eax
>         ret
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures. Ok for mainline?
>
>
> 2023-06-27 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> * config/i386/i386-expand.cc (ix86_expand_branch): Also use ptest
> for TImode comparisons on 32-bit architectures.
> * config/i386/i386.md (cbranch<mode>4): Change from SDWIM to
> SWIM1248x to exclude/avoid TImode being conditional on -m64.
> (cbranchti4): New define_expand for TImode on both TARGET_64BIT
> and/or with TARGET_SSE4_1.
> * config/i386/predicates.md (ix86_timode_comparison_operator):
> New predicate that depends upon TARGET_64BIT.
> (ix86_timode_comparison_operand): Likewise.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/pieces-memcmp-2.c: New test case.
OK with a small fix.
Thanks,
Uros.
+;; Return true if this is a valid second operand for a TImode comparison.
+(define_predicate "ix86_timode_comparison_operand"
+  (if_then_else (match_test "TARGET_64BIT")
+    (match_operand 0 "x86_64_general_operand")
+    (match_operand 0 "nonimmediate_operand")))
+
+
Please remove the duplicate blank line above.
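
As an illustration only (this is not part of the patch or of either message), the two-piece comparison that the improved assembly performs can be sketched in portable C. Each 16-byte piece is XORed against the reference and tested for all-zero, which is roughly the role of the vpxor/vptest pair. The names chunk16_equal and equal32_by_pieces are hypothetical, and two 64-bit halves stand in for one SSE register, so this shows the idea rather than what GCC actually emits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return 1 when the 16 bytes at A and B are equal.
   XOR the two pieces and test the result for zero, in two 64-bit halves
   (an assumption made for portability; the real code uses one xmm register).  */
static int
chunk16_equal (const void *a, const void *b)
{
  uint64_t x[2], y[2];
  memcpy (x, a, sizeof x);	/* memcpy tolerates unaligned input */
  memcpy (y, b, sizeof y);
  return ((x[0] ^ y[0]) | (x[1] ^ y[1])) == 0;
}

/* Hypothetical stand-in for the expanded 32-byte equality test in foo:
   compare the first 16 bytes, then the second 16 bytes only if needed.  */
static int
equal32_by_pieces (const char *a, const char *b)
{
  if (!chunk16_equal (a, b))
    return 0;
  return chunk16_equal (a + 16, b + 16);
}
```

With the 32-byte array t from the test case (31 characters plus the terminating NUL), equal32_by_pieces (a, t) would play the part of __builtin_memcmp (a, &t[0], sizeof (t)) == 0.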