From: Jeff Law
To: Richard Biener
Cc: Vineet Gupta, gcc@gcc.gnu.org, Kito Cheng, Philipp Tomsich
Subject: Re: Redundant constants in coremark crc8 for RISCV/aarch64 (no-if-conversion)
Date: Wed, 19 Oct 2022 07:30:44 -0600

On 10/19/22 01:46, Richard Biener wrote:
> On Wed, Oct 19, 2022 at 5:44 AM Jeff Law via Gcc wrote:
>>
>> On 10/18/22 20:09, Vineet Gupta wrote:
>>> On 10/18/22 16:36, Jeff Law wrote:
>>>>>> There isn't a great place in GCC to handle this right now.
>>>>>> If the constraints were relaxed in PRE, then we'd have a chance, but
>>>>>> getting the cost model right is going to be tough.
>>>>> It would have been better (for this specific case) if loop unrolling
>>>>> was not being done so early. The tree pass cunroll is flattening it
>>>>> out and leaving it for the rest of the tree/rtl passes to pick up the
>>>>> pieces and remove any redundancies, if at all. It obviously needs to
>>>>> be early if we are injecting 7x more instructions, but seems like a
>>>>> lot to unravel.
>>>> Yup. If that loop gets unrolled, it's going to be a mess. It will
>>>> almost certainly make this problem worse as each iteration is going
>>>> to have a pair of constants loaded and no good way to remove them.
>>> That's the original problem that I started this thread with. I'd
>>> snipped the disassembly as it would have been too much text but
>>> basically on RV, the Coremark crc8 loop of const 8 iterations gets
>>> unrolled, including 8 extraneous insn pairs to load the same constant,
>>> which is preposterous. Other arches side-step by using if-conversion
>>> / cond moves, the latter currently WIP in RV International. x86 w/o
>>> if-convert seems OK since the const can be encoded in the xor insn.
>>>
>>> OTOH given that gimple/tree-pass cunroll is doing the culprit loop
>>> unrolling and introducing the redundant const 8 times, can it be
>>> addressed there somehow?
>>> tree_estimate_loop_size() seems to identify a constant expression, not
>>> just an operand. Can it be taught to identify a "non-trivial const"
>>> and hoist/code-move the expression? Sorry, just rambling here, most
>>> likely nonsense.
> On GIMPLE all constants are "simple".
>
>> Oh, cunroll. There might be a distinct flag for complete unrolling.
> At -O3 we peel completely, there's no flag to disable that.
>
>> I really expect something like Click's work is the way forward.
>> Essentially when you VN the function you'll identify those constants and
>> collapse them all down to a single instance. Then the GCM phase will
>> kick in and find a place to put the evaluation so that you have one and
>> only one.
> I'd say postreload gcse would be a place to do that. At least when
> there's no available hardreg CSEing likely isn't going to be a win.

That's an interesting idea. Do it aggressively post-reload when we know
there's a register available. Vineet, that seems like it's worth
investigation.

jeff
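
For anyone following along without the CoreMark source at hand, the loop under discussion looks roughly like the sketch below. This is written from memory and is an assumption, not a verbatim copy: CoreMark's own ee_u8/ee_u16 typedefs are replaced with stdint types, and the exact constants the RISC-V backend ends up materializing after optimization may differ from the literals shown. The point is only that the loop body has a fixed trip count of 8 and uses constants too wide for a 12-bit xori/ori immediate, so once the loop is completely peeled each copy can end up reloading them into a register.

```c
#include <stdint.h>

/* Sketch of CoreMark's bitwise CRC-16 step for one input byte.
   Assumption: reconstructed from memory, with uint8_t/uint16_t standing
   in for CoreMark's ee_u8/ee_u16. */
uint16_t crcu8(uint8_t data, uint16_t crc)
{
    uint8_t i, x16, carry;

    /* Constant trip count of 8: this is the loop cunroll peels completely. */
    for (i = 0; i < 8; i++) {
        x16 = (uint8_t)((data & 1) ^ ((uint8_t)crc & 1));
        data >>= 1;

        if (x16 == 1) {
            crc ^= 0x4002;   /* too wide for a 12-bit immediate */
            carry = 1;
        } else {
            carry = 0;
        }

        crc >>= 1;
        if (carry)
            crc |= 0x8000;   /* likewise needs a register on RV */
        else
            crc &= 0x7fff;
    }
    return crc;
}
```

With if-conversion the two branches collapse into conditional moves and the constants can be shared; without it, every peeled iteration keeps its own conditional block, which is where the repeated constant loads come from.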