From: Max Filippov
Date: Sat, 11 Jun 2022 00:58:55 -0700
Subject: Re: [PATCH v2 4/4] xtensa: Improve constant synthesis for both integer and floating-point
To: "Takayuki 'January June' Suwa"
Cc: GCC Patches
In-Reply-To: <200341ff-4907-c6da-07cb-86b9b4588f84@yahoo.co.jp>
List-Id: Gcc-patches mailing list

Hi Suwa-san,

On Fri, Jun 10, 2022 at 8:28 AM Takayuki 'January June' Suwa wrote:
>
> This patch revises the previous implementation of constant synthesis.
>
> First, changed to use define_split machine description pattern and to run
> after reload pass, in order not to interfere some optimizations such as
> the loop invariant motion.
>
> Second, not only integer but floating-point is subject to processing.
>
> Third, several new synthesis patterns - when the constant cannot fit into
> a "MOVI Ax, simm12" instruction, but:
>
> I.   can be represented as a power of two minus one (eg. 32767, 65535 or
>      0x7fffffffUL)
>        => "MOVI(.N) Ax, -1" + "SRLI Ax, Ax, 1 ... 31" (or "EXTUI")
> II.  is between -34816 and 34559
>        => "MOVI(.N) Ax, -2048 ... 2047" + "ADDMI Ax, Ax, -32768 ... 32512"
> III. (existing case) can fit into a signed 12-bit if the trailing zero bits
>      are stripped
>        => "MOVI(.N) Ax, -2048 ... 2047" + "SLLI Ax, Ax, 1 ... 31"
>
> The above sequences consist of 5 or 6 bytes and have latency of 2 clock
> cycles, in contrast with "L32R Ax, " (3 bytes and one clock latency, but
> may suffer additional one clock pipeline stall and implementation-specific
> InstRAM/ROM access penalty) plus 4 bytes of constant value.
>
> In addition, 3-instructions synthesis patterns (8 or 9 bytes, 3 clock
> latency) are also provided when optimizing for speed and L32R instruction
> has considerable access penalty:
>
> IV.  2-instructions synthesis (any of I ... III) followed by
>      "SLLI Ax, Ax, 1 ... 31"
> V.   2-instructions synthesis followed by either "ADDX[248] Ax, Ax, Ax"
>      or "SUBX8 Ax, Ax, Ax" (multiplying by 3, 5, 7 or 9)
>
> gcc/ChangeLog:
>
>     * config/xtensa/xtensa-protos.h (xtensa_constantsynth):
>     New prototype.
>     * config/xtensa/xtensa.cc (xtensa_emit_constantsynth,
>     xtensa_constantsynth_2insn, xtensa_constantsynth_rtx_SLLI,
>     xtensa_constantsynth_rtx_ADDSUBX, xtensa_constantsynth):
>     New backend functions that process the abovementioned logic.
>     (xtensa_emit_move_sequence): Revert the previous changes.
>     * config/xtensa/xtensa.md: New split patterns for integer
>     and floating-point, as the frontend part.
>
> gcc/testsuite/ChangeLog:
>
>     * gcc.target/xtensa/constsynth_2insns.c: New.
>     * gcc.target/xtensa/constsynth_3insns.c: Ditto.
>     * gcc.target/xtensa/constsynth_double.c: Ditto.
> ---
>  gcc/config/xtensa/xtensa-protos.h             |   1 +
>  gcc/config/xtensa/xtensa.cc                   | 133 +++++++++++++++---
>  gcc/config/xtensa/xtensa.md                   |  50 +++++++
>  .../gcc.target/xtensa/constsynth_2insns.c     |  44 ++++++
>  .../gcc.target/xtensa/constsynth_3insns.c     |  24 ++++
>  .../gcc.target/xtensa/constsynth_double.c     |  11 ++
>  6 files changed, 247 insertions(+), 16 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/xtensa/constsynth_2insns.c
>  create mode 100644 gcc/testsuite/gcc.target/xtensa/constsynth_3insns.c
>  create mode 100644 gcc/testsuite/gcc.target/xtensa/constsynth_double.c

this change results in a bunch of ICEs in the tests like this:

during RTL pass: split2
gcc/gcc/testsuite/gcc.c-torture/compile/20120727-1.c: In function 'f':
gcc/gcc/testsuite/gcc.c-torture/compile/20120727-1.c:13:1: internal compiler error: in gen_split_5, at config/xtensa/xtensa.md:1186
0x7b6fdb gen_split_5(rtx_insn*, rtx_def**)
        gcc/gcc/config/xtensa/xtensa.md:1186
0xa8f927 try_split(rtx_def*, rtx_insn*, int)
        gcc/gcc/emit-rtl.cc:3795
0xde5fe9 split_insn
        gcc/gcc/recog.cc:3384
0xdecde7 split_all_insns()
        gcc/gcc/recog.cc:3488
0xdecea8 execute
        gcc/gcc/recog.cc:4406

-- 
Thanks.
-- Max
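
For reference, the 2-instruction cases I to III from the quoted patch description can be sketched as a small standalone classifier in C. This is only an illustration of the arithmetic, not the code from the patch: classify(), struct synth and the enum names are hypothetical, and trying the cases in the order I, II, III is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these come from the patch. */
enum synth_kind { SYNTH_MOVI, SYNTH_SRLI, SYNTH_ADDMI, SYNTH_SLLI, SYNTH_NONE };

struct synth {
    enum synth_kind kind;
    int32_t movi;   /* immediate of the leading MOVI */
    int32_t arg;    /* shift count, or ADDMI immediate */
};

static int fits_simm12(int64_t v) { return v >= -2048 && v <= 2047; }

static struct synth classify(int32_t v)
{
    struct synth r = { SYNTH_NONE, 0, 0 };
    uint32_t u = (uint32_t)v;

    /* A single MOVI is enough; no synthesis needed. */
    if (fits_simm12(v)) {
        r.kind = SYNTH_MOVI; r.movi = v;
        return r;
    }

    /* Case I: v == 2^n - 1.  0xFFFFFFFF >> (32 - n) yields that value. */
    if ((u & (u + 1)) == 0) {
        int n = 0;
        while ((u >> n) != 0)   /* u <= 0x7fffffff here, so n stops at <= 31 */
            n++;
        r.kind = SYNTH_SRLI; r.movi = -1; r.arg = 32 - n;
        return r;
    }

    /* Case II: MOVI low part + ADDMI multiple of 256 in -32768 ... 32512. */
    if (v >= -34816 && v <= 34559) {
        int32_t hi = v & ~0xff;         /* round down to a multiple of 256 */
        if (hi > 32512)  hi = 32512;    /* clamp to the ADDMI range */
        if (hi < -32768) hi = -32768;
        r.kind = SYNTH_ADDMI; r.movi = v - hi; r.arg = hi;
        return r;
    }

    /* Case III: strip trailing zero bits, then MOVI + SLLI. */
    assert(u != 0);                     /* zero was handled as a plain MOVI */
    int shift = 0;
    while (((u >> shift) & 1) == 0)
        shift++;
    /* Exact division avoids right-shifting a negative value. */
    int64_t stripped = (int64_t)v / ((int64_t)1 << shift);
    if (shift > 0 && fits_simm12(stripped)) {
        r.kind = SYNTH_SLLI; r.movi = (int32_t)stripped; r.arg = shift;
    }
    return r;
}
```

With this sketch, classify(65535) selects case I with a shift of 16, classify(34559) selects case II as 2047 + 32512, and classify(0x12340000) selects case III as 1165 shifted left by 18. A value such as 0x12345678 matches none of the cases and would still need L32R or a longer sequence.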