From: Philipp Tomsich
Date: Fri, 18 Nov 2022 10:14:51 +0100
Subject: Re: [PATCH] RISC-V: branch-(not)equals-zero compares against $zero
To: Palmer Dabbelt
Cc: jeffreyalaw@gmail.com, gcc-patches@gcc.gnu.org, Vineet Gupta, Kito Cheng, jlaw@ventanamicro.com, christoph.muellner@vrull.eu

On Fri, 18 Nov 2022 at 05:53, Palmer Dabbelt wrote:
>
> On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote:
> >
> > On 11/8/22 12:55, Philipp Tomsich wrote:
> >> If we are testing a register or a paradoxical subreg (i.e. anything that is not
> >> a partial subreg) for equality/non-equality with zero, we can generate a branch
> >> that compares against $zero. This will work for QI, HI, SI and DImode, so we
> >> enable this for ANYI.
> >>
> >> 2020-08-30 gcc/ChangeLog:
> >>
> >> * config/riscv/riscv.md (*branch_equals_zero): Added pattern.
> >
> > I've gone back and forth on this a few times. As you know, I hate
> > subregs in the target descriptions and I guess I need to extend that to
> > querying if something is a subreg or not rather than just subregs
> > appearing in the RTL.
> >
> > Presumably the idea behind rejecting partial subregs is that the bits outside
> > the partial are unspecified, but that's also going to be true if we're
> > looking at a hardreg in QImode (for example) irrespective of it being
> > wrapped in a subreg.
> >
> > I don't doubt it works the vast majority of the time, but I haven't been
> > able to convince myself it'll work all the time. How do we ensure that
> > the bits outside the mode are zero? I've been bitten by this kind of
> > problem before, and it's safe to say it was exceedingly painful to find.
>
> I don't really understand the middle-end issues here (if there are
> any?), but I'm pretty sure code like this has passed by a few times
> before and we've yet to find a reliable way to optimize these cases.
> There's a bunch of patterns where knowing the XLEN-extension of shorter
> values would let us generate better code, but there are also cases where
> we'd generate worse code by ensuring that any extension scheme is followed.
>
> Every time I've seen this come up before I've managed to convince myself
> we can't really fix the problem in the backend, though: if we always
> generate extended values in registers then we just push the cost over to
> the other patterns. The only way I've come up with to handle something
> like this is to push more types into the middle-end so we can track
> these high bits and generate the faster sequences where we know what
> they are. That seems like a huge mess, though, and every time it comes
> up folks run away ;)

You are perfectly right that this problem cannot be fixed in the backend,
at least not in a general manner (i.e., additional patterns can resolve
some of the cases, and it is messy in the backend).
In fact, we are looking at fixing this before/during lowering by avoiding
the extension whenever possible (based on the type information and even
value ranges).
However, this work will miss GCC 13, and that is the reason why the
band-aid was submitted here.

> Sorry if that's kind of vague, I usually find a way to break these but
> my box isn't cooperating with GCC builds today so I haven't even gotten
> that far yet...

I am gathering the original rationale why this should be safe from our
internal communication (the change is 2 years old, after all) and will
follow up.
If you find a way to break this in the meantime, please let us know.

Philipp.
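
For readers following the thread, the concern about bits outside the mode can be illustrated with a toy model. This is only a sketch and not taken from the patch: it assumes a 64-bit register (XLEN = 64), and the helper names (low_bits, branch_eqz_full_register, branch_eqz_in_mode) are hypothetical. It shows that a branch comparing the whole register against $zero agrees with the narrow-mode comparison only when the upper bits are known to be zero.

```python
XLEN = 64  # assumed register width (RV64); illustrative only


def low_bits(reg: int, mode_bits: int) -> int:
    """Return the value held in the low `mode_bits` bits of a register."""
    return reg & ((1 << mode_bits) - 1)


def branch_eqz_full_register(reg: int) -> bool:
    """Model of a branch against $zero: all XLEN bits take part in the compare."""
    return (reg & ((1 << XLEN) - 1)) == 0


def branch_eqz_in_mode(reg: int, mode_bits: int) -> bool:
    """What the narrow-mode comparison means: only the mode's own bits count."""
    return low_bits(reg, mode_bits) == 0


# A QImode (8-bit) zero whose upper register bits were left unspecified.
garbage_high = 0xDEADBEEF_00000000 | 0x1200  # low 8 bits are zero
clean = 0x0

for reg in (clean, garbage_high):
    # Print the register, the full-register result, and the QImode result.
    print(hex(reg), branch_eqz_full_register(reg), branch_eqz_in_mode(reg, 8))
```

With a clean register both comparisons agree. With unspecified upper bits, the full-register branch reports "not zero" although the QImode value is zero, which is the case the questions above are about.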