Date: Mon, 30 Oct 2023 10:36:09 -0600
Subject: Re: [PATCH 2/4] [ifcvt] if convert x=c ?
 y+z : y by RISC-V Zicond like insns
To: Fei Gao, gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com
References: <20231030072523.26818-1-gaofei@eswincomputing.com> <20231030072523.26818-3-gaofei@eswincomputing.com>
From: Jeff Law
In-Reply-To: <20231030072523.26818-3-gaofei@eswincomputing.com>

On 10/30/23 01:25, Fei Gao wrote:
> Conditional add, if zero
> rd = (rc == 0) ? (rs1 + rs2) : rs1
> -->
> czero.nez rd, rs2, rc
> add rd, rs1, rd
>
> Conditional add, if non-zero
> rd = (rc != 0) ? (rs1 + rs2) : rs1
> -->
> czero.eqz rd, rs2, rc
> add rd, rs1, rd
>
> Co-authored-by: Xiao Zeng
>
> gcc/ChangeLog:
>
>         * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith
>         (noce_try_cond_zero_arith): handler for conditional zero op
>         (noce_process_if_block): add noce_try_cond_zero_arith with hook control
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/zicond_ifcvt_opt.c: New test.

So the idea here is to improve upon the current code we generate for
conditional arithmetic. Right now we support conditional arithmetic
using zicond, but the sequence is poor.

Basically the if-converter knows how to generate a conditional add, but
it does so in a way that isn't as efficient as it could be. In effect
ifcvt wants to generate

   t = a + b
   res = cond ? t : a

We want to change it to

   t = cond ? b : 0
   res = a + t

The latter sequence expands to more efficient code trivially for risc-v.

I wandered a bit through the combine dumps to see if it would be easy to
capture this class of cases.
We never get anything useful, and while I can imagine "bridge" patterns
that would potentially expose enough RTL to allow us to rewrite without
changing ifcvt, it'd just be a hack IMHO.

So going back to ifcvt...

In the first sequence the addition must wait for both "a" and "b" to be
available, and the conditional move can fire on the next cycle. In the
second sequence the conditional move can fire when just "b" is
available. So that gives "a" another cycle to become ready (say if it's
coming from memory or a multi-cycle operation like multiply). On the
other hand, the second sequence does keep "a" live longer.

In the end I strongly suspect neither sequence is significantly better
than the other. Meaning I don't think we need to conditionalize using
condzero arith at all.

I'll note that subsequent patches add MINUS, IOR, XOR and AND. It's also
possible (and important) to handle shifts. There's a conditional
shift-by-6 in leela's hot path.

Overall this looks a lot like the VRULL code, just less complete. My
inclination is to do a cleanup pass on the VRULL code, verify it handles
all the cases in your tests, and commit the VRULL implementation with
your tests.

I'll do some further poking at this today. Thanks for re-submitting
these bits. Getting this target independent work cleaned up has been on
my TODO for a while now.

jeff
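
For anyone following along, the two shapes discussed above can be modeled in plain C. This is only a sketch, not the ifcvt.cc code: the function names are hypothetical, czero_eqz and czero_nez just model the Zicond instructions of the same name, and unsigned arithmetic is used so wraparound is well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of czero.eqz rd, rs, rc: result is 0 when rc == 0, else rs. */
static uint64_t czero_eqz(uint64_t rs, uint64_t rc) { return rc == 0 ? 0 : rs; }

/* Model of czero.nez rd, rs, rc: result is 0 when rc != 0, else rs. */
static uint64_t czero_nez(uint64_t rs, uint64_t rc) { return rc != 0 ? 0 : rs; }

/* First shape (hypothetical name): add, then select between sum and "a". */
static uint64_t add_then_select(uint64_t a, uint64_t b, uint64_t cond)
{
  uint64_t t = a + b;
  return cond ? t : a;
}

/* Second shape: conditionally zero "b", then add unconditionally.
   Matches "czero.eqz rd, rs2, rc; add rd, rs1, rd" from the patch.  */
static uint64_t zero_then_add_if_nonzero(uint64_t a, uint64_t b, uint64_t cond)
{
  uint64_t t = czero_eqz(b, cond);
  return a + t;
}

/* The "if zero" variant: "czero.nez rd, rs2, rc; add rd, rs1, rd".  */
static uint64_t zero_then_add_if_zero(uint64_t a, uint64_t b, uint64_t cond)
{
  uint64_t t = czero_nez(b, cond);
  return a + t;
}

/* Check on a handful of values that both shapes compute the same result. */
static void check_conditional_add_models(void)
{
  const uint64_t vals[] = { 0, 1, 6, 42, UINT64_MAX };
  const size_t n = sizeof vals / sizeof vals[0];

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      for (size_t k = 0; k < n; k++)
        {
          uint64_t a = vals[i], b = vals[j], c = vals[k];
          assert(zero_then_add_if_nonzero(a, b, c) == add_then_select(a, b, c));
          assert(zero_then_add_if_zero(a, b, c) == add_then_select(a, b, c == 0));
        }
}
```

Calling check_conditional_add_models() from a test driver exercises both variants. Note that the second shape only needs "b" and the condition before the czero step, which is the scheduling point made above.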