From: Jeff Law
Date: Sat, 23 Dec 2023 22:27:15 -0700
Subject: Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode
To: YunQiang Su
Cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com, pinskia@gmail.com, rguenther@suse.de
References: <20231223085858.4136369-1-syq@gcc.gnu.org> <04a01582-2bff-496f-95b1-4643b5a2f494@gmail.com>

On 12/23/23 15:46, YunQiang Su wrote:
> Jeff Law wrote on Sun, Dec 24, 2023 at 00:51:
>>
>>
>>
>> On 12/23/23 01:58, YunQiang Su wrote:
>>> On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true,
>>> if bit 31 or above is polluted by a bit operation, we need a
>>> truncate. Let's emit one, using the same hard register as input
>>> and output. The RTL may look like:
>>>
>>> (insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
>>>         (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
>>>      (nil))
>>>
>>> We use the /s/u flags to mark it as really needed: in
>>> combine_simplify_rtx this insn may otherwise be considered already
>>> truncated, so let's skip this combination.
>>>
>>> gcc/ChangeLog:
>>>     PR: 104914.
>>>     * combine.cc (try_combine): Skip combine with truncate if
>>>     dest is subreg and has /u/s flags on platforms where
>>>     TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true.
>>>     * expr.cc (expand_assignment): Emit a truncate insn if
>>>     bit 31 or above is polluted for SImode.
>>>
>>> gcc/testsuite/ChangeLog:
>>>     PR: 104914.
>>>     * gcc.target/mips/pr104914.c: New testcase.
>> I would suggest you show the RTL before/after whatever transformation
>> has caused problems on your target and explain why you think the
>> transformation is incorrect.
>>
>
> Before this patch, the RTL looks like this:
> (insn 19 18 20 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
>             (const_int 8 [0x8])
>             (const_int 24 [0x18]))
>         (subreg:DI (reg:QI 205) 0)) "../xx.c":7:29 -1
>      (nil))
> (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0)))
> "../xx.c":7:29 -1
>      (nil))
> (jump_insn 23 20 24 2 (set (pc)
>         (if_then_else (lt (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
>                 (const_int 0 [0]))
>             (label_ref 32)
>             (pc))) "../xx.c":10:5 -1
>      (int_list:REG_BR_PROB 440234148 (nil))
>  -> 32)
>
> Then, in combine,
> (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
>         (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0)))
> "../xx.c":7:29 -1
>      (nil))
> is converted to
> (note 20 19 23 2 NOTE_INSN_DELETED)
> MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true
> on the basis that the hard register is always sign-extended, but here
> the hard register is polluted by zero_extract.
>
> If we just patch combine.cc so that it does not eat the sign_extend
> here, the sign_extend will still disappear in later passes, because
> MIPS defines sign_extend as "emit_note (NOTE_INSN_DELETED)".
>
> So I tried to insert a new truncate RTX here:
> (insn 21 20 24 2 (set (reg/v:DI 200 [ val ])
>         (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
>      (nil))
> This is the RTL for this C code:
> int32_t fun (int64_t arg) {
>     int32_t a = (int32_t) arg;
>     return a;
> }
> But the `reload` pass then gets an ICE. I haven't dug into the real
> problem. If the new RTX is
> (insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
>         (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
>      (nil))
> the `reload` pass happily accepts it, and it is then converted to the
>     # this instruction makes sure the reg is properly sign-extended.
>     `sll $rN, $rN, 0`
> hard instruction.
>
> The problem is that simplify-rtx (called by combine) will believe that
> REG 200 has already been truncated to SImode, because the dest is a
> subreg:SI.
>
> So I use the /s/u flags to tell combine not to do so.
>
>> Focus on the RTL semantics as well as the target specific semantics
>> because both are critically important here.
>>
>> I strongly suspect you're just papering over a problem elsewhere.
>>
>
> Yes, I suspect so too. Any new ideas?

Well, I see multiple intertwined issues and I think MIPS has largely
mucked this up.

At a high level, DI -> SI truncation is not a nop on MIPS64. We must
explicitly sign extend the value from SI->DI to preserve the invariant
that SImode objects are sign-extended to DImode. If we fail to do that,
then the SImode conditional branch patterns simply aren't going to work.

What doesn't make sense to me is the target hook. For truncation, the
output mode is going to be smaller than the input mode, which makes
logical sense and is codified in the documentation:

> @deftypefn {Target Hook} bool TARGET_TRULY_NOOP_TRUNCATION (poly_uint64 @var{outprec}, poly_uint64 @var{inprec})
> This hook returns true if it is safe to ``convert'' a value of
> @var{inprec} bits to one of @var{outprec} bits (where @var{outprec} is
> smaller than @var{inprec}) by merely operating on it as if it had only
> @var{outprec} bits. The default returns true unconditionally, which
> is correct for most machines. When @code{TARGET_TRULY_NOOP_TRUNCATION}
> returns false, the machine description should provide a @code{trunc}
> optab to specify the RTL that performs the required truncation.

Yet the implementation in the mips backend is:

> static bool
> mips_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
> {
>   return !TARGET_64BIT || inprec <= 32 || outprec > 32;
> }

Can you verify what values are getting in here? If we're being called
with inprec as 32 and outprec as 64, we're going to return true, which
makes absolutely no sense at all.

Jeff
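
A minimal standalone sketch of the two points above, for experimenting
outside GCC. It is not GCC code: noop_truncation_sketch copies only the
boolean expression from the quoted mips_truly_noop_truncation, with
TARGET_64BIT turned into a parameter and poly_uint64 simplified to
unsigned. The helpers sign_extend_si, insert_byte_field and
is_sign_extended_si are hypothetical names that model the register
contents as a plain uint64_t, assuming bit positions are counted from
the least significant bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same expression as the quoted hook; target_64bit stands in for
   TARGET_64BIT (an assumption made so this compiles on its own).  */
bool
noop_truncation_sketch (bool target_64bit, unsigned outprec, unsigned inprec)
{
  return !target_64bit || inprec <= 32 || outprec > 32;
}

/* Model of the invariant: an SImode value is kept in a 64-bit register
   sign-extended from bit 31, which is what "sll $rN, $rN, 0" restores.  */
uint64_t
sign_extend_si (uint64_t reg)
{
  uint64_t low = reg & UINT64_C (0xffffffff);
  return (low ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);
}

/* Model of a zero_extract destination of width 8 at bit POS:
   overwrite those 8 bits and leave every other bit untouched.  */
uint64_t
insert_byte_field (uint64_t reg, unsigned pos, uint8_t byte)
{
  assert (pos <= 56);
  uint64_t mask = UINT64_C (0xff) << pos;
  return (reg & ~mask) | ((uint64_t) byte << pos);
}

/* True if REG already holds a properly sign-extended SImode value.  */
bool
is_sign_extended_si (uint64_t reg)
{
  return reg == sign_extend_si (reg);
}
```

With these definitions, noop_truncation_sketch (true, 32, 64) is false
(the real DI -> SI truncation), while noop_truncation_sketch (true, 64, 32)
is true because inprec <= 32 short-circuits the test. Likewise,
insert_byte_field (0, 24, 0x80) yields 0x80000000, which is not in
sign-extended form until sign_extend_si turns it into
0xffffffff80000000; a 64-bit "less than zero" test sees the former as
non-negative and the latter as negative.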