From: Uros Bizjak
Date: Tue, 27 Jun 2023 22:20:12 +0200
Subject: Re: [x86 PATCH] Add cbranchti4 pattern to i386.md (for -m32 compare_by_pieces).
To: Roger Sayle
Cc: gcc-patches@gcc.gnu.org

On Tue, Jun 27, 2023 at 7:22 PM Roger Sayle wrote:
>
>
> This patch fixes some very odd (unanticipated) code generation by
> compare_by_pieces with -m32 -mavx, since the recent addition of the
> cbranchoi4 pattern.  The issue is that cbranchoi4 is available with
> TARGET_AVX, but cbranchti4 is currently conditional on TARGET_64BIT
> which results in the odd behaviour (thanks to OPTAB_WIDEN) that with
> -m32 -mavx, compare_by_pieces ends up (inefficiently) widening 128-bit
> comparisons to 256-bits before performing PTEST.
>
> This patch fixes this by providing a cbranchti4 pattern that's available
> with either TARGET_64BIT or TARGET_SSE4_1.
>
> For the test case below (again from PR 104610):
>
> int foo(char *a)
> {
>   static const char t[] = "0123456789012345678901234567890";
>   return __builtin_memcmp(a, &t[0], sizeof(t)) == 0;
> }
>
> GCC with -m32 -O2 -mavx currently produces the bonkers:
>
> foo:    pushl   %ebp
>         movl    %esp, %ebp
>         andl    $-32, %esp
>         subl    $64, %esp
>         movl    8(%ebp), %eax
>         vmovdqa .LC0, %xmm4
>         movl    $0, 48(%esp)
>         vmovdqu (%eax), %xmm2
>         movl    $0, 52(%esp)
>         movl    $0, 56(%esp)
>         movl    $0, 60(%esp)
>         movl    $0, 16(%esp)
>         movl    $0, 20(%esp)
>         movl    $0, 24(%esp)
>         movl    $0, 28(%esp)
>         vmovdqa %xmm2, 32(%esp)
>         vmovdqa %xmm4, (%esp)
>         vmovdqa (%esp), %ymm5
>         vpxor   32(%esp), %ymm5, %ymm0
>         vptest  %ymm0, %ymm0
>         jne     .L2
>         vmovdqu 16(%eax), %xmm7
>         movl    $0, 48(%esp)
>         movl    $0, 52(%esp)
>         vmovdqa %xmm7, 32(%esp)
>         vmovdqa .LC1, %xmm7
>         movl    $0, 56(%esp)
>         movl    $0, 60(%esp)
>         movl    $0, 16(%esp)
>         movl    $0, 20(%esp)
>         movl    $0, 24(%esp)
>         movl    $0, 28(%esp)
>         vmovdqa %xmm7, (%esp)
>         vmovdqa (%esp), %ymm1
>         vpxor   32(%esp), %ymm1, %ymm0
>         vptest  %ymm0, %ymm0
>         je      .L6
> .L2:    movl    $1, %eax
>         xorl    $1, %eax
>         vzeroupper
>         leave
>         ret
> .L6:    xorl    %eax, %eax
>         xorl    $1, %eax
>         vzeroupper
>         leave
>         ret
>
> with this patch, we now generate the (slightly) more sensible:
>
> foo:    vmovdqa .LC0, %xmm0
>         movl    4(%esp), %eax
>         vpxor   (%eax), %xmm0, %xmm0
>         vptest  %xmm0, %xmm0
>         jne     .L2
>         vmovdqa .LC1, %xmm0
>         vpxor   16(%eax), %xmm0, %xmm0
>         vptest  %xmm0, %xmm0
>         je      .L5
> .L2:    movl    $1, %eax
>         xorl    $1, %eax
>         ret
> .L5:    xorl    %eax, %eax
>         xorl    $1, %eax
>         ret
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2023-06-27  Roger Sayle
>
> gcc/ChangeLog
>         * config/i386/i386-expand.cc (ix86_expand_branch): Also use ptest
>         for TImode comparisons on 32-bit architectures.
>         * config/i386/i386.md (cbranch<mode>4): Change from SDWIM to
>         SWIM1248x to exclude/avoid TImode being conditional on -m64.
>         (cbranchti4): New define_expand for TImode on both TARGET_64BIT
>         and/or with TARGET_SSE4_1.
>         * config/i386/predicates.md (ix86_timode_comparison_operator):
>         New predicate that depends upon TARGET_64BIT.
>         (ix86_timode_comparison_operand): Likewise.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/pieces-memcmp-2.c: New test case.

OK with a small fix.

Thanks,
Uros.

+;; Return true if this is a valid second operand for a TImode comparison.
+(define_predicate "ix86_timode_comparison_operand"
+  (if_then_else (match_test "TARGET_64BIT")
+    (match_operand 0 "x86_64_general_operand")
+    (match_operand 0 "nonimmediate_operand")))
+
+

Please remove the duplicate blank line above.
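
For anyone reading the thread later, here is a rough portable C model of what the improved code sequence computes for the quoted test case: the 32-byte comparison is split into two 128-bit pieces, each piece is XORed against the constant, and the result is tested for all-zero (the job done by vpxor and vptest in the quoted assembly). This is only a sketch. The names block16_differs and foo_by_pieces are made up for illustration and are not GCC functions, and it does not reflect how compare_by_pieces is implemented inside the compiler.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): returns nonzero if the two
   16-byte blocks differ.  XOR followed by an all-zero test models the
   vpxor + vptest pair in the quoted assembly.  */
static int
block16_differs (const unsigned char *a, const unsigned char *b)
{
  uint64_t x[2], y[2];
  memcpy (x, a, 16);   /* unaligned-safe loads */
  memcpy (y, b, 16);
  return ((x[0] ^ y[0]) | (x[1] ^ y[1])) != 0;
}

/* Hypothetical equivalent of foo from the test case, written as two
   128-bit pieces.  sizeof (t) is 32: 31 characters plus the NUL.  */
int
foo_by_pieces (const char *a)
{
  static const char t[] = "0123456789012345678901234567890";
  const unsigned char *p = (const unsigned char *) a;
  const unsigned char *q = (const unsigned char *) t;

  if (block16_differs (p, q))
    return 0;                      /* first piece differs: not equal */
  return !block16_differs (p + 16, q + 16);
}
```

The early exit after the first piece corresponds to the jne .L2 in the generated code; the caller must pass a buffer of at least 32 readable bytes, as with the original memcmp.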