From: Uros Bizjak
Date: Mon, 8 Aug 2022 09:48:26 +0200
Subject: Re: [x86 PATCH] Move V1TI shift/rotate lowering from expand to pre-reload split.
To: Roger Sayle
Cc: gcc-patches@gcc.gnu.org

On Fri, Aug 5, 2022 at 8:36 PM Roger Sayle wrote:
>
> This patch moves the lowering of 128-bit V1TImode shifts and rotations
> by constant bit counts into sequences of SSE operations, from the RTL
> expansion pass to the pre-reload split pass.  Postponing this splitting
> of shifts and rotates enables (will enable) the TImode equivalents of
> these operations/instructions to be considered as candidates by the
> (TImode) STV pass.  Technically, this patch changes the existing
> expanders so that they continue to lower shifts by variable amounts,
> but shifts by constant operands now become RTL instructions, specified
> by define_insn_and_split, that are triggered by ix86_pre_reload_split.
> The one minor complication is that logical shifts by multiples of eight
> don't get split, but are handled by existing insn patterns, such as
> sse2_ashlv1ti3 and sse2_lshrv1ti3.  There should be no changes in
> generated code with this patch, which just adjusts the pass in which
> the transformations get applied.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
>
> 2022-08-05  Roger Sayle
>
> gcc/ChangeLog
> 	* config/i386/sse.md (ashlv1ti3): Delay lowering of logical left
> 	shifts by constant bit counts.
> 	(*ashlv1ti3_internal): New define_insn_and_split that lowers
> 	logical left shifts by constant bit counts, that aren't multiples
> 	of 8, before reload.
> 	(lshrv1ti3): Delay lowering of logical right shifts by constant.
> 	(*lshrv1ti3_internal): New define_insn_and_split that lowers
> 	logical right shifts by constant bit counts, that aren't multiples
> 	of 8, before reload.
> 	(ashrv1ti3): Delay lowering of arithmetic right shifts by
> 	constant bit counts.
> 	(*ashrv1ti3_internal): New define_insn_and_split that lowers
> 	arithmetic right shifts by constant bit counts before reload.
> 	(rotlv1ti3): Delay lowering of rotate left by constant.
> 	(*rotlv1ti3_internal): New define_insn_and_split that lowers
> 	rotate left by constant bit counts before reload.
> 	(rotrv1ti3): Delay lowering of rotate right by constant.
> 	(*rotrv1ti3_internal): New define_insn_and_split that lowers
> 	rotate right by constant bit counts before reload.

+(define_insn_and_split "*ashlv1ti3_internal"
+  [(set (match_operand:V1TI 0 "register_operand")
 	(ashift:V1TI
 	 (match_operand:V1TI 1 "register_operand")
-	 (match_operand:QI 2 "general_operand")))]
-  "TARGET_SSE2 && TARGET_64BIT"
+	 (match_operand:SI 2 "const_0_to_255_operand")))]
+  "TARGET_SSE2
+   && TARGET_64BIT
+   && (INTVAL (operands[2]) & 7) != 0

Please introduce a const_0_to_255_not_mul_8_operand predicate.

Alternatively, and preferably, you can use pattern shadowing, where the
preceding, more constrained pattern matches before the following,
broader pattern does.

Uros.
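For readers unfamiliar with the lowering being moved: the `(INTVAL (operands[2]) & 7) != 0` condition excludes shift counts that are multiples of 8, which existing patterns handle with a single byte-wise shuffle; for the remaining counts the split emits a short SSE sequence instead. The following C sketch (hypothetical helper name, not from the patch) models that sequence for a logical left shift by a constant `0 < n < 64` on two 64-bit lanes, mirroring the psllq/pslldq/psrlq/por idiom; the real split in sse.md also covers counts of 64 and above with a different sequence:

```c
#include <stdint.h>

/* Two 64-bit lanes standing in for one V1TImode SSE register.  */
typedef struct { uint64_t lo, hi; } u128;

/* Sketch of the SSE lowering of x << n for a constant 0 < n < 64
   that is not a multiple of 8.  */
static u128 v1ti_shl_const(u128 x, unsigned n)
{
    /* psllq: shift each 64-bit lane left by n independently; the bits
       that should cross the lane boundary are lost here.  */
    u128 t = { x.lo << n, x.hi << n };
    /* pslldq by 8 bytes moves the low lane into the high lane, psrlq
       by 64-n isolates the carried-over bits, and por merges them.  */
    t.hi |= x.lo >> (64 - n);
    return t;
}
```

For example, shifting the value with lo = 0x8000000000000001 left by 1 carries the top bit of the low lane into the high lane, which the per-lane psllq alone would drop.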