From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x102d.google.com (mail-pj1-x102d.google.com [IPv6:2607:f8b0:4864:20::102d]) by sourceware.org (Postfix) with ESMTPS id 3F99E3858D20 for ; Wed, 5 Apr 2023 16:18:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3F99E3858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-pj1-x102d.google.com with SMTP id d13so34508718pjh.0 for ; Wed, 05 Apr 2023 09:18:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; t=1680711481; x=1683303481; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=SOXmsjR0BznMmypycVZ5PqbeIvP71lHfiI5zp8tPYrU=; b=QyJ4raK73LuUROfXkc8I7XCkH+rZwOn8QDkJTw5wbxQ4vL7+SoKfktDnU6tq4It2n1 9NEI8cjjCQcstuglhiph2NuAVidSmk1REefKvPVqlJbMeV4wUKYXGbzy33Jved1B14vo 0MD3fhz0N9OCVGxcm+9qIGzyCGPZqxDkPDoap5FQMeyuTNhT9ME5u6sKnOEqQQnZQw0H at1D3VZYumXFCc156FbfTRGDm//xXRCXsrQbxu70NsJMh3UrczF8h8pR64fqcW0gNCNK iKjtUfDjQa9TuoHfhVaFHi6U3WcltG1/pK7H5yYkoDU53IgJwtyd//BsCNjkeGHpKeCy eDHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680711481; x=1683303481; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=SOXmsjR0BznMmypycVZ5PqbeIvP71lHfiI5zp8tPYrU=; b=Y1WF6ZFix+BVU1CYg9U0THWHVbMbT1J81Fg/JLI8U64KfAVahPbJFX2LI2k14jEagL 39Vu+TnGAzOTfoH/iWPUHcqw3iLkokLPl9zCNNfl55/BnuUDj452G83pReJJkfMX8svh Zv3ACQ2xGem46dCbUrne4tCOzaRLsEFUQQ9HkW0cax4lNPtSmKGMXiV8/7ZgZvsUu579 ItG2UhfJC3p/7Pp+f26K2GLMN/fRF4FCyzQqmDbFmS2hjBUvRox0iL//uPh1giA9L9o1 9yQf7PSNd+hUKJpM9QYZOHcGUtjFSnzDOZvzxrfc8kybDRAUKmYRcjJUvZLvyFqxqUV9 bL9A== X-Gm-Message-State: AAQBX9eyiHPe11u8TmFTe8JbwyQHrWA0tduXo5QnhO2nbruEgBHDkm56 LeNN1SAoOK8I5/x8LrLzNMk= X-Google-Smtp-Source: AKy350aZ1ZZNlLkMRdpm5IIECGywZ5UfRgKstNwRcsGSpCY8F5V8f+Nwqhhflky3E4J8JKj9t0AJGA== X-Received: by 2002:a17:902:c948:b0:1a1:f70c:c81a with SMTP id i8-20020a170902c94800b001a1f70cc81amr8141951pla.9.1680711480867; Wed, 05 Apr 2023 09:18:00 -0700 (PDT) Received: from ?IPV6:2601:681:8600:13d0::f0a? ([2601:681:8600:13d0::f0a]) by smtp.gmail.com with ESMTPSA id f6-20020a17090a9b0600b0023f9782333fsm1597562pjp.13.2023.04.05.09.17.59 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 05 Apr 2023 09:18:00 -0700 (PDT) Message-ID: Date: Wed, 5 Apr 2023 10:17:59 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0 Subject: Re: [PATCH] dse: Handle SUBREGs of word REGs differently for WORD_REGISTER_OPERATIONS targets [PR109040] Content-Language: en-US To: Jakub Jelinek Cc: Richard Biener , Richard Sandiford , Eric Botcazou , gcc-patches@gcc.gnu.org References: <8e0e3cd5-e4db-ce8a-b7dc-baac32aed516@gmail.com> From: Jeff Law In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 4/5/23 08:51, Jakub Jelinek wrote: > On Wed, Apr 05, 2023 at 07:14:23AM -0600, Jeff Law wrote: >>> The following testcase is miscompiled on riscv since the addition >>> of *mvconst_internal define_insn_and_split. >>> I believe the bug is in DSE. We have: >>> (insn 36 35 39 2 (set (mem/c:SI (plus:SI (reg/f:SI 65 frame) >>> (const_int -64 [0xffffffffffffffc0])) [2 S4 A128]) >>> (reg:SI 166)) "pr109040.c":9:11 178 {*movsi_internal} >>> (expr_list:REG_DEAD (reg:SI 166) >>> (nil))) >>> (insn 39 36 40 2 (set (reg:SI 171) >>> (zero_extend:SI (mem/c:HI (plus:SI (reg/f:SI 65 frame) >>> (const_int -64 [0xffffffffffffffc0])) [0 S2 A128]))) "pr109040.c":9:11 111 {*zero_extendhisi2} >>> (nil)) >>> and RTL DSE's replace_read since r0-86337-g18b526e806ab6455 handles >>> even different modes like in the above case, and so it optimizes it into: >>> (insn 47 35 39 2 (set (reg:HI 175) >>> (subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 179 {*movhi_internal} >>> (expr_list:REG_DEAD (reg:SI 166) >>> (nil))) >>> (insn 39 47 40 2 (set (reg:SI 171) >>> (zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111 {*zero_extendhisi2} >>> (expr_list:REG_DEAD (reg:HI 175) >>> (nil))) >>> Pseudo 166 is result of AND with 0x8084c constant (forced into a register). >> Right. But do we agree that the two above are equivalent? If they are then >> changing DSE just papers over the combine issue downstream. > > It is true that an instruction like > (insn 8 7 9 2 (set (reg:HI 141) > (subreg:HI (reg:SI 142) 0)) "aauu.c":6:18 181 {*movhi_internal} > (nil)) > can appear in the IL on WORD_REGISTER_OPERATIONS target, but I think the > upper bits shouldn't be random garbage in that case, it should be zero > extended or sign extended. Well, that's one of the core questions here. What are the state of the upper 16 bits of (reg:HI 141)? The WORD_REGISTER_OPERATIONS docs aren't 100% clear as we're not really doing any operation. So again, I think we need to decide if the DSE transformation is correct or not. I *think* we can aggree that insn 39 is OK. It's really the semantics of insn 47 that I think we need to agree on. What is the state of the upper 16 bits of (reg:HI 175) after insn 47? > > What happens in combine is we enter combine.cc (simplify_set) with > (set (reg:HI 175) > (subreg:HI (and:SI (reg:SI 167 [ m ]) > (reg:SI 168)) 0)) > and there trigger the > /* If we have (set x (subreg:m1 (op:m2 ...) 0)) with OP being some operation, > and X being a REG or (subreg (reg)), we may be able to convert this to > (set (subreg:m2 x) (op)). > > We can always do this if M1 is narrower than M2 because that means that > we only care about the low bits of the result. > > However, on machines without WORD_REGISTER_OPERATIONS defined, we cannot > perform a narrower operation than requested since the high-order bits will > be undefined. On machine where it is defined, this transformation is safe > as long as M1 and M2 have the same number of words. */ > transformation into: > (set (subreg:SI (reg:HI 175) 0) > (and:SI (reg:SI 167 [ m ]) > (reg:SI 168))) I think we're OK a this point. > Though, it is !paradoxical_subreg_p (src) in that case, so it is done > regardless of WORD_REGISTER_OPERATIONS I think. > > Then after that try_combine we do: > 13325 record_value_for_reg (dest, record_dead_insn, > 13326 WORD_REGISTER_OPERATIONS > 13327 && word_register_operation_p (SET_SRC (setter)) > 13328 && paradoxical_subreg_p (SET_DEST (setter)) > 13329 ? SET_SRC (setter) > 13330 : gen_lowpart (GET_MODE (dest), > 13331 SET_SRC (setter))); > and the 3 conditions are true here and so record value of the whole setter. > That then records among other things nonzero_bits as 0x8084c. > > Next when trying to combine > (insn 39 47 40 2 (set (reg:SI 171) > (zero_extend:SI (reg:HI 175))) "pr109040.c":10:11 111 {*zero_extendhisi2} > (expr_list:REG_DEAD (reg:HI 175) > (nil))) > into > (insn 40 39 43 2 (set (reg:SI 172) > (leu:SI (reg:SI 171) > (const_int 5 [0x5]))) "pr109040.c":10:11 291 {*sleu_sisi} > (expr_list:REG_DEAD (reg:SI 171) > (nil))) > we i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0); > and that correctly simplifies it into > (and:SI (subreg:SI (reg:HI 175) 0) > (const_int 2124 [0x84c])) Right. Still seems sane at this point. > We substitute that > (leu:SI (and:SI (subreg:SI (reg:HI 175) 0) > (const_int 2124 [0x84c])) > (const_int 5 [0x5])) > but then trigger the WORD_REGISTER_OPERATIONS block in simplify_comparison: > /* If this is (and:M1 (subreg:M1 X:M2 0) (const_int C1)) where C1 > fits in both M1 and M2 and the SUBREG is either paradoxical > or represents the low part, permute the SUBREG and the AND > and try again. */ > if (GET_CODE (XEXP (op0, 0)) == SUBREG > && CONST_INT_P (XEXP (op0, 1))) > { > unsigned HOST_WIDE_INT c1 = INTVAL (XEXP (op0, 1)); > /* Require an integral mode, to avoid creating something like > (AND:SF ...). */ > if ((is_a > (GET_MODE (SUBREG_REG (XEXP (op0, 0))), &tmode)) > /* It is unsafe to commute the AND into the SUBREG if the > SUBREG is paradoxical and WORD_REGISTER_OPERATIONS is > not defined. As originally written the upper bits > have a defined value due to the AND operation. > However, if we commute the AND inside the SUBREG then > they no longer have defined values and the meaning of > the code has been changed. > Also C1 should not change value in the smaller mode, > see PR67028 (a positive C1 can become negative in the > smaller mode, so that the AND does no longer mask the > upper bits). */ > && ((WORD_REGISTER_OPERATIONS > && mode_width > GET_MODE_PRECISION (tmode) > && mode_width <= BITS_PER_WORD > && trunc_int_for_mode (c1, tmode) == (HOST_WIDE_INT) c1) > || (mode_width <= GET_MODE_PRECISION (tmode) > && subreg_lowpart_p (XEXP (op0, 0)))) > && mode_width <= HOST_BITS_PER_WIDE_INT > && HWI_COMPUTABLE_MODE_P (tmode) > && (c1 & ~mask) == 0 > && (c1 & ~GET_MODE_MASK (tmode)) == 0 > && c1 != mask > && c1 != GET_MODE_MASK (tmode)) > { > op0 = simplify_gen_binary (AND, tmode, > SUBREG_REG (XEXP (op0, 0)), > gen_int_mode (c1, tmode)); > op0 = gen_lowpart (mode, op0); > continue; > } > } > c1 is 0x84c. I believe this is the exact spot where things go wrong, > and is because for WORD_REGISTER_OPERATIONS we assume something that the > DSE added instruction didn't guarantee. pan2.li zero'd in on the same block of code. The leap I'm strugging with is the assertion that this combine code assumes something that the DSE added instruction does not guarantee. That's why I asked if we agreed that the before/after from DSE was correct or not. I think that's still the outstanding question. Jeff