From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <7ec2ebde-9242-4907-85d9-d76e84bea5ec@gmail.com>
Date: Sun, 19 Nov 2023 11:52:35 -0700
Subject: Re: [PATCH 09/44] RISC-V: Rework branch costing model for if-conversion
To: "Maciej W. Rozycki", gcc-patches@gcc.gnu.org
Cc: Andrew Waterman, Jim Wilson, Kito Cheng, Palmer Dabbelt
From: Jeff Law

On 11/18/23 22:36, Maciej W. Rozycki wrote:
> The generic branch costing model for if-conversion assumes a fixed cost
> of COSTS_N_INSNS (2) for a conditional branch, and that one half of that
> cost comes from a preceding condition-set instruction, such as with
> MODE_CC targets, and then the other half of that cost is for the actual
> branch instruction.  This is hardcoded for `if_info.original_cost' in
> `noce_find_if_block' and regardless of the cost set for branches via
> BRANCH_COST.
>
> Then `default_max_noce_ifcvt_seq_cost' instructs if-conversion to prefer
> a branchless sequence as costly as high as triple the BRANCH_COST value
> set.  This is apparently to make up for the inability to accurately
> guess the branch penalty.
>
> Consequently for the BRANCH_COST of 3 we commonly set for tuning,
> if-conversion will consider branchless sequences costing 3 * 3 - 2 = 7
> instruction units more than a corresponding branch sequence.  For the
> BRANCH_COST of 4 such as with `sifive-7-series' tuning this is even
> worse, at 3 * 4 - 2 = 10.  Effectively it means a branchless sequence
> will always be chosen if available, even a very inefficient one.
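
To make the arithmetic in the paragraph above concrete, here is a minimal
sketch in C.  It is not GCC code: the helper names are hypothetical, and
the scale of 4 cost units per instruction is an assumption restated
locally so the snippet stands alone.  It only models the generic numbers
quoted above (a fixed branch cost of two instructions and an upper bound
of triple BRANCH_COST).

```c
#include <assert.h>

/* Assumed cost scale: 4 units per instruction (local stand-in macro).  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Fixed cost the generic model assigns to the branched sequence.  */
int generic_branch_cost(void)
{
    return COSTS_N_INSNS(2);
}

/* Generic upper bound for a branchless sequence: triple BRANCH_COST.  */
int generic_max_seq_cost(int branch_cost)
{
    return branch_cost * COSTS_N_INSNS(3);
}

/* How many instructions more than the branch a branchless
   sequence may cost and still be accepted (hypothetical helper).  */
int slack_in_insns(int branch_cost)
{
    return (generic_max_seq_cost(branch_cost) - generic_branch_cost())
           / COSTS_N_INSNS(1);
}

/* Checks the two figures quoted in the patch description.  */
void check_quoted_figures(void)
{
    assert(slack_in_insns(3) == 7);   /* 3 * 3 - 2 */
    assert(slack_in_insns(4) == 10);  /* 3 * 4 - 2 */
}
```

Calling check_quoted_figures() passes for both tunings, which matches the
"7" and "10" above: the slack grows linearly with BRANCH_COST while the
branch side stays fixed at two instructions.
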
>
> Rework the branch costing model to better match our architecture,
> observing in particular that we have no preparatory instructions for
> branches so that the cost of a branch is naked BRANCH_COST plus any
> extra overhead the processing of a branch's source RTX might incur.
>
> Provide TARGET_INSN_COST and TARGET_MAX_NOCE_IFCVT_SEQ_COST handlers
> that return suitable cost based on BRANCH_COST.  The latter hook
> usually returns a value that is lower than the cost of the corresponding
> branched sequence.  This is because we don't really want to produce a
> branchless sequence that is more expensive than the original branched
> sequence.  If this turns out too conservative for some corner case, then
> this choice might be revisited.
>
> Then we don't want to fiddle with `noce_find_if_block' without a lot of
> cross-target verification, so add TARGET_NOCE_CONVERSION_PROFITABLE_P
> defined such that it subtracts the fixed COSTS_N_INSNS (2) cost from the
> cost of the original branched sequence supplied and instead adds actual
> branch cost calculated from the conditional branch instruction used.  It
> is then further tweaked according to simple analysis of the replacement
> branchless sequence produced so as to cancel the cost of an extraneous
> zero extend operation produced by `noce_try_store_flag_mask' as observed
> with gcc/testsuite/gcc.target/riscv/pr105314.c.
>
> Tweak the testsuite accordingly and set `-mbranch-cost=' explicitly for
> the relevant cases so that the expected if-conversion transformation is
> made regardless of the default BRANCH_COST value of tuning in effect.
> Some of these settings will be lowered later on as deficiencies in
> branchless sequence generation have been fixed that lower their cost
> calculated by if-conversion.

As I suspect you know, a big part of the problem here is that
BRANCH_COST and rtx_cost don't have any common scale, and thus trying
to compare BRANCH_COST to rtx_cost doesn't have a well-defined meaning.
That hasn't kept us from trying to do precisely that, and the result
has always been less than satisfactory.  You're introducing more, but
I don't think there's a reasonable way out of this mess at this point.

>
> gcc/
>       * config/riscv/riscv.cc (riscv_insn_cost): New function.
>       (riscv_max_noce_ifcvt_seq_cost): Likewise.
>       (riscv_noce_conversion_profitable_p): Likewise.
>       (TARGET_INSN_COST): New macro.
>       (TARGET_MAX_NOCE_IFCVT_SEQ_COST): New macro.
>       (TARGET_NOCE_CONVERSION_PROFITABLE_P): New macro.
>
> gcc/testsuite/
>       * gcc.target/riscv/zicond-primitiveSemantics_compare_imm_return_imm_imm.c:
>       Explicitly set the branch cost.
>       * gcc.target/riscv/zicond-primitiveSemantics_compare_imm_return_imm_reg.c:
>       Likewise.
>       * gcc.target/riscv/zicond-primitiveSemantics_compare_imm_return_reg_reg.c:
>       Likewise.
>       * gcc.target/riscv/zicond-primitiveSemantics_compare_reg_return_imm_imm.c:
>       Likewise.
>       * gcc.target/riscv/zicond-primitiveSemantics_compare_reg_return_imm_reg.c:
>       Likewise.
>       * gcc.target/riscv/zicond-primitiveSemantics_compare_reg_return_reg_reg.c:
>       Likewise.
> ---
>  FWIW I don't understand why the test cases absolutely HAD to have such
> overlong names guaranteed to exceed our 80 column limit in any context.
> It's such a pain to handle.

I dislike the long names as well.  I nearly changed them myself as part
of the eswin submission, but that seemed a bit gratuitous to me so I
left them as-is.  If you want to rename them, be my guest; consider it
pre-approved ;-)

WRT the extraneous zero-extension: isn't that arguably a bug in the scc
expander for RISC-V?  Fixing that isn't a prerequisite here, but it's
probably worth a bit of someone's time.

OK for the trunk.

jeff