Date: Thu, 17 Nov 2022 20:53:01 -0800 (PST)
Subject: Re: [PATCH] RISC-V: branch-(not)equals-zero compares against $zero
In-Reply-To: <41d536f8-0f5d-d7fa-1b9d-37d88ade6640@gmail.com>
CC: philipp.tomsich@vrull.eu, gcc-patches@gcc.gnu.org, Vineet Gupta, Kito Cheng, jlaw@ventanamicro.com, christoph.muellner@vrull.eu
From: Palmer Dabbelt
To: jeffreyalaw@gmail.com

On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote:
>
> On 11/8/22 12:55, Philipp Tomsich wrote:
>> If we are testing a register or a paradoxical subreg (i.e. anything that is not
>> a partial subreg) for equality/non-equality with zero, we can generate a branch
>> that compares against $zero. This will work for QI, HI, SI and DImode, so we
>> enable this for ANYI.
>>
>> 2020-08-30 gcc/ChangeLog:
>>
>> * config/riscv/riscv.md (*branch_equals_zero): Added pattern.
>
> I've gone back and forth on this a few times.
> As you know, I hate subregs in the target descriptions and I guess I
> need to extend that to querying if something is a subreg or not rather
> than just subregs appearing in the RTL.
>
>
> Presumably the idea behind rejecting partial subregs is the bits outside
> the partial are unspecified, but that's also going to be true if we're
> looking at a hardreg in QImode (for example) irrespective of it being
> wrapped in a subreg.
>
>
> I don't doubt it works the vast majority of the time, but I haven't been
> able to convince myself it'll work all the time.  How do we ensure that
> the bits outside the mode are zero?  I've been bitten by this kind of
> problem before, and it's safe to say it was exceedingly painful to find.

I don't really understand the middle-end issues here (if there are any?),
but I'm pretty sure code like this has passed by a few times before and
we've yet to find a reliable way to optimize these cases.  There's a bunch
of patterns where knowing the XLEN-extension of shorter values would let
us generate better code, but there are also cases where we'd generate
worse code by ensuring that some extension scheme is always followed.

Every time I've seen this come up before I've managed to convince myself
we can't really fix the problem in the backend, though: if we always
generate extended values in registers then we just push the cost over to
the other patterns.  The only way I've come up with to handle something
like this is to push more types into the middle-end so we can track these
high bits and generate the faster sequences where we know what they are.
That seems like a huge mess, though, and every time it comes up folks run
away ;)

Sorry if that's kind of vague, I usually find a way to break these but my
box isn't cooperating with GCC builds today so I haven't even gotten that
far yet...