References: <00f801d8b081$2f23e160$8d6ba420$@nextmovesoftware.com>
In-Reply-To: <00f801d8b081$2f23e160$8d6ba420$@nextmovesoftware.com>
From: Uros Bizjak
Date: Mon, 15 Aug 2022 11:06:51 +0200
Subject: Re: [x86_64 PATCH] Support shifts and rotates by integer constants in TImode STV.
To: Roger Sayle
Cc: GCC Patches

On Mon, Aug 15, 2022 at 10:29 AM Roger Sayle wrote:
>
>
> Many thanks to Uros for reviewing/approving all of the previous pieces.
> This patch adds support for converting 128-bit TImode shifts and rotates
> to SSE equivalents using V1TImode during the TImode STV pass.
> Previously, only logical shifts by multiples of 8 were handled
> (from my patch earlier this month).
>
> As an example of the benefits, the following rotate by 32-bits:
>
> unsigned __int128 a, b;
> void rot32() { a = (b >> 32) | (b << 96); }
>
> when compiled on x86_64 with -O2 previously generated:
>
>         movq    b(%rip), %rax
>         movq    b+8(%rip), %rdx
>         movq    %rax, %rcx
>         shrdq   $32, %rdx, %rax
>         shrdq   $32, %rcx, %rdx
>         movq    %rax, a(%rip)
>         movq    %rdx, a+8(%rip)
>         ret
>
> with this patch, now generates:
>
>         movdqa  b(%rip), %xmm0
>         pshufd  $57, %xmm0, %xmm0
>         movaps  %xmm0, a(%rip)
>         ret
>
> [which uses a V4SI permutation for those that don't read SSE].
> This should help 128-bit cryptography codes, that interleave XORs
> with rotations (but that don't use additions or subtractions).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
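
For readers who want to see why a single pshufd can stand in for the
128-bit rotate quoted above, here is a minimal sketch in portable C. It
is an illustration only, not code from the patch: the type u128 and the
functions rotr32, pshufd_model and rot32_matches_pshufd are hypothetical
names, and pshufd_model is a scalar model of the instruction's lane
selection rather than the real intrinsic. The 128-bit value is held as
two 64-bit halves so the sketch does not rely on __int128.

```c
#include <stdint.h>

/* Hypothetical 128-bit container: lo holds bits 0..63, hi holds bits 64..127. */
typedef struct { uint64_t lo, hi; } u128;

/* Rotate right by 32 bits, i.e. (b >> 32) | (b << 96) from the quoted example. */
u128 rotr32(u128 b)
{
    u128 r;
    r.lo = (b.lo >> 32) | (b.hi << 32);
    r.hi = (b.hi >> 32) | (b.lo << 32);
    return r;
}

/* Scalar model of pshufd: result lane i takes source lane (imm >> 2*i) & 3,
   where lane 0 is the least significant 32 bits. */
u128 pshufd_model(u128 v, unsigned imm)
{
    uint32_t src[4] = {
        (uint32_t)v.lo, (uint32_t)(v.lo >> 32),
        (uint32_t)v.hi, (uint32_t)(v.hi >> 32)
    };
    uint32_t dst[4];
    for (int i = 0; i < 4; i++)
        dst[i] = src[(imm >> (2 * i)) & 3];

    u128 r;
    r.lo = (uint64_t)dst[0] | ((uint64_t)dst[1] << 32);
    r.hi = (uint64_t)dst[2] | ((uint64_t)dst[3] << 32);
    return r;
}

/* Returns 1 when the rotate and the modelled shuffle with immediate 57
   produce the same 128-bit result for the given halves, 0 otherwise. */
int rot32_matches_pshufd(uint64_t lo, uint64_t hi)
{
    u128 b = { lo, hi };
    u128 x = rotr32(b);
    u128 y = pshufd_model(b, 57);
    return x.lo == y.lo && x.hi == y.hi;
}
```

The immediate 57 is 0b00111001, which selects source lanes 1, 2, 3, 0
for result lanes 0 through 3. Rotating right by 32 bits moves each
32-bit lane down by one position and wraps lane 0 to the top, which is
the same permutation.
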
>
>
> 2022-08-15  Roger Sayle
>
> gcc/ChangeLog
>         * config/i386/i386-features.cc
>         (timode_scalar_chain::compute_convert_gain): Provide costs for
>         shifts and rotates.  Provide gains for comparisons against 0/-1.

Please split out the compare part, it doesn't fit under "Support
shifts and rotates by integer constants in TImode STV." summary.

>         (timode_scalar_chain::convert_insn): Handle ASHIFTRT, ROTATERT
>         and ROTATE just like existing ASHIFT and LSHIFTRT cases.
>         (timode_scalar_to_vector_candidate_p): Handle all shifts and
>         rotates by integer constants between 0 and 127.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/sse4_1-stv-9.c: New test case.

OK for the patch without COMPARE stuff, the separate COMPARE patch is
pre-approved.

Thanks,
Uros.