From: Uros Bizjak
Date: Sun, 5 Jun 2022 21:12:18 +0200
Subject: Re: [x86 PATCH] Double word implementation of and; cmp to not; test optimization.
To: Roger Sayle
Cc: GCC Patches

On Sun, Jun 5, 2022 at 7:19 PM Roger Sayle wrote:
>
>
> This patch extends the recent and;cmp to not;test optimization to also
> perform this transformation for TImode on TARGET_64BIT and DImode on -m32,
> One motivation for this is that it's a step to fixing the current failure
> of gcc.target/i386/pr65105-5.c on -m32.
>
> A more direct benefit for x86_64 is that the following code:
>
> int foo(__int128 x, __int128 y)
> {
>   return (x & y) == y;
> }
>
> improves (with -O2 -mbmi) from:
>
>         movq    %rdi, %r8
>         movq    %rsi, %rdi
>         movq    %rdx, %rsi
>         andq    %rcx, %rdi
>         movq    %r8, %rax
>         andq    %rdx, %rax
>         movq    %rdi, %rdx
>         xorq    %rsi, %rax
>         xorq    %rcx, %rdx
>         orq     %rdx, %rax
>         sete    %al
>         movzbl  %al, %eax
>         ret
>
> to the much better:
>
>         movq    %rdi, %r8
>         movq    %rsi, %rdi
>         andn    %rdx, %r8, %rax
>         andn    %rcx, %rdi, %rsi
>         orq     %rsi, %rax
>         sete    %al
>         movzbl  %al, %eax
>         ret
>
> The major theme of this patch is to generalize many of i386.md's
> *di3_doubleword patterns to become *_doubleword patterns, i.e.
> whenever there exists a "double word" optimization for DImode with -m32,
> there should be an equivalent TImode optimization on TARGET_64BIT.

No, please do not mix two different themes in one patch.
OTOH, the only TImode optimizations that can be used with SSE registers
are logic instructions and some constant shifts; there is no TImode
arithmetic. I assume your end goal is to introduce STV for TImode on
64-bit targets, because the DImode patterns for x86_32 were introduced to
avoid early decomposition by the middle end and to split instructions
that STV didn't convert to vector instructions after the STV pass.

So, let's start with basic V1TImode support before optimizations are introduced.

Uros.

> The following patch has been tested on x86_64-pc-linux-gnu with
> make bootstrap and make -k check, where on TARGET_64BIT there are
> no new failures, but paradoxically with --target_board=unix{-m32}
> the other dg-final clause in gcc.target/i386/pr65105-5.c now fails.
> Counter-intuitively, this is progress, and pr65105-5.c may now be
> fixed (without using peephole2) simply by tweaking the STV pass to
> handle andn/test (in a follow-up patch).
> OK for mainline?
>
>
> 2022-06-05  Roger Sayle
>
> gcc/ChangeLog
>         * config/i386/i386.cc (ix86_rtx_costs) : Provide costs
>         for double word comparisons and tests (comparisons against zero).
>         * config/i386/i386.md (*test_not_doubleword): Split DWI
>         and;cmp into andn;cmp $0 as a pre-reload splitter.
>         (define_expand and3): Generalize from SWIM1248x to SWIDWI.
>         (define_insn_and_split "*anddi3_doubleword"): Rename/generalize...
>         (define_insn_and_split "*and3_doubleword"): ... to this.
>         (define_insn "*andndi3_doubleword"): Rename and generalize...
>         (define_insn "*andn3_doubleword): ... to this.
>         (define_split): Split andn when TARGET_BMI for both modes.
>         (define_split): Split andn when !TARGET_BMI for both modes.
>         (define_expand 3): Generalize from SWIM1248x to
>         SWIDWI.
>         (define_insn_and_split "*3_doubleword): Generalize
>         from DI mode to both modes.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/testnot-3.c: New test case.
>
>
> Thanks again,
> Roger
> --
>
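
For reference, the and;cmp to not;test rewrite discussed in the quoted
description appears to rest on the identity (x & y) == y  <=>  (~x & y) == 0,
which is what lets two andn instructions plus an or replace the and/xor
sequence. The following is only a minimal C sketch of that identity, not code
from the patch: the function names and the make_u128 helper are hypothetical,
and unsigned __int128 (a GCC/Clang extension) is assumed in place of the
signed type in the quoted example so that the shifts stay well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: and, then compare against y. */
int subset_and_cmp(unsigned __int128 x, unsigned __int128 y)
{
    return (x & y) == y;
}

/* Rewritten form: and-not, then test against zero. */
int subset_andn_test(unsigned __int128 x, unsigned __int128 y)
{
    return (~x & y) == 0;
}

/* Hypothetical helper: build a 128-bit value from two 64-bit halves. */
unsigned __int128 make_u128(uint64_t hi, uint64_t lo)
{
    return ((unsigned __int128)hi << 64) | lo;
}

/* Check that both forms agree on every combination of a few sample halves. */
void check_samples(void)
{
    const uint64_t v[] = { 0, 1, 0x0F, 0xFF, UINT64_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);

    for (int a = 0; a < n; a++)
        for (int b = 0; b < n; b++)
            for (int c = 0; c < n; c++)
                for (int d = 0; d < n; d++) {
                    unsigned __int128 x = make_u128(v[a], v[b]);
                    unsigned __int128 y = make_u128(v[c], v[d]);
                    assert(subset_and_cmp(x, y) == subset_andn_test(x, y));
                }
}
```

Calling check_samples() from a small test driver exercises both forms on the
same inputs; it only spot-checks the identity and says nothing about which
instructions a given compiler version will emit.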