From: Richard Biener
Date: Wed, 16 Mar 2022 09:15:32 +0100
Subject: Re: RFA: crc builtin functions & optimizations
To: Joern Rennecke
Cc: GCC Patches, Jon Beniston

On Tue, Mar 15, 2022 at 4:15 PM Joern Rennecke wrote:
>
> On 15/03/2022, Richard Biener wrote:
>
> > Why's this a new pass?  Every walk over all insns costs time.
>
> It should typically scan considerably less than all the insns.
>
> > The pass
> > lacks any comments as to what CFG / stmt structure is matched.
>
> I've put a file in:
> config/riscv/tree-crc-doc.txt
>
> would this text be suitable to put in a comment block in tree-crc.cc ?

Yes, that would be a better place I think.

> > From
> > a quick look it seems like it first(?) statically matches a stmt sequence
> > without considering intermediate stmts, so matching should be quite
> > fragile.
>
> It might be fragile inasmuch as it won't match when things change, but
> the matching has remained effective for seven years and across two
> architecture families with varying word sizes.
> And with regard to matching only what it's supposed to match, I believe
> I have checked all the data dependencies and phis so that it's definitely
> calculating a CRC.
>
> > Why not match (sub-)expressions with the help of match.pd?
>
> Can you match a loop with match.pd ?

No, this is why I said (sub-)expression.  I'm mainly talking about

  tmp = (tmp >> 1) | (tmp << (sizeof (tmp) * (8 /*CHAR_BIT*/) - 1));
  if ((long)tmp < 0)

for example - one key bit of the CRC seems to be a comparison, so you'd
match that with sth like

(match (crc_compare @0)
 (lt (bit_ior (rshift @0 integer_onep) (lshift @0 INTEGER_CST@1)) integer_zerop)
 (if (compare_tree_int (@1, TYPE_SIZE_UNIT (TREE_TYPE (@0)) * 8 - 1))))

where you can add alternative expression forms.  You can then use the
generated match function from the pass.  See for example the
tree-ssa-forwprop.cc use of gimple_ctz_table_index.
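
To make the sub-expression concrete, here is a minimal C sketch of what the rotate-and-compare idiom computes.  It is not code from the patch: the function names and the self-check are hypothetical, and the cast relies on GCC's modulo behaviour when converting an out-of-range unsigned value to long.  The rotate moves bit 0 into the sign position, so the comparison amounts to testing the low bit of the original value, which is presumably why it makes a useful anchor for a matcher.

```c
#include <assert.h>
#include <limits.h>

/* The quoted idiom: rotate right by one bit, then test the sign.
   Unsigned-to-long conversion of an out-of-range value is
   implementation-defined in ISO C; GCC wraps modulo 2^N.  */
static int rotated_sign_bit_set (unsigned long tmp)
{
  tmp = (tmp >> 1) | (tmp << (sizeof (tmp) * CHAR_BIT - 1));
  return (long) tmp < 0;
}

/* Equivalent formulation: test bit 0 of the original value.  */
static int low_bit_set (unsigned long tmp)
{
  return (tmp & 1UL) != 0;
}

/* Hypothetical self-check; not part of any GCC testsuite.  */
static void self_check (void)
{
  const unsigned long samples[]
    = { 0UL, 1UL, 2UL, 3UL, 0x80UL, ULONG_MAX, ULONG_MAX - 1 };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (rotated_sign_bit_set (samples[i]) == low_bit_set (samples[i]));
}
```

Calling self_check () exercises both forms on a handful of values; a matcher with alternative expression forms could treat either spelling as the same CRC comparison.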

> > Any reason why you match CRC before early inlining and thus even when
> > not optimizing?  Matching at least after early FRE/DCE/DSE would help
> > to get rid of abstraction and/or memory temporary uses.
>
> I haven't originally placed it there, but I believe benefits include:
> - Getting rid of the loop without having to actively delete it in the
>   crc pass (this also might be safer, as we just have to make sure we
>   are computing the CRC, and DCE will determine if there is any
>   ancillary result that is left, and only delete the loop if it's
>   really dead).
> - The optimized function is available for inlining.

The canonical place to transform loops into builtins would be loop
distribution.  Another place would be final value replacement, since you
basically replace the reduction result with a call to the builtin, but I
think loop distribution is the better overall place.  See how we match
strlen() there.

Richard.
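
As an illustration of "replace the reduction result with a call", here is a rough C sketch of the two loop shapes side by side.  Neither function is taken from the patch: the names are hypothetical, and the reflected CRC-32 polynomial 0xEDB88320 with the usual initial and final inversion is only an assumed example.  In both loops a single value is live afterwards (n, respectively crc), and that value is what a pass would swap for a library or builtin call.

```c
#include <stddef.h>
#include <stdint.h>

/* strlen-shaped loop: the only value live after the loop is n.  */
static size_t count_to_nul (const char *s)
{
  size_t n = 0;
  while (s[n] != '\0')
    n++;
  return n;
}

/* Bitwise CRC over a buffer: crc is the sole reduction result.
   Polynomial and inversions are illustrative assumptions.  */
static uint32_t crc32_bitwise (const unsigned char *buf, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++)
    {
      crc ^= buf[i];
      /* Eight shift-and-conditional-xor steps, one per input bit.  */
      for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
  return crc ^ 0xFFFFFFFFu;
}
```

With these parameters, crc32_bitwise over the nine ASCII bytes "123456789" gives the standard CRC-32 check value 0xCBF43926.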