Date: Thu, 2 Feb 2023 12:26:47 +0000 (UTC)
From: Richard Biener
To: "juzhe.zhong@rivai.ai"
cc: gcc-patches, "kito.cheng", "richard.sandiford", jeffreyalaw, apinski
Subject: Re: Re: [PATCH] CPROP: Allow cprop optimization when the function
 has a single block
References: <2E16F446920AC99F+2023020120563582688674@rivai.ai>, <0CEDD257-703F-4ED3-B927-C21279557CE6@gmail.com>

On Thu, 2 Feb 2023, juzhe.zhong@rivai.ai wrote:

> Yeah, Thanks. You are right. CSE should do the job.
> Now I know the reason CSE failed to optimize is I include VL_REGNUM(66)/VTYPE_RENUM(67) hard reg
> as the dependency of pred_broadcast:
> (insn 19 18 20 4 (set (reg:VNx1DI 152)
>         (if_then_else:VNx1DI (unspec:VNx1BI [
>                     (const_vector:VNx1BI repeat [
>                             (const_int 1 [0x1])
>                         ])
>                     (const_int 4 [0x4])
>                     (const_int 2 [0x2]) repeated x2
>                     (const_int 0 [0])
>                     (reg:SI 66 vl)
>                     (reg:SI 67 vtype)
>                 ] UNSPEC_VPREDICATE)
>             (vec_duplicate:VNx1DI (reg/v:DI 148 [ x ]))
>             (unspec:VNx1DI [
>                     (const_int 0 [0])
>                 ] UNSPEC_VUNDEF))) "rvv.c":22:23 695 {pred_broadcastvnx1di}
>      (nil))
> Then CSE failed to set the 152 as copy.
>
> VL_REGNUM(66)/VTYPE_RENUM(67) are the global hard reg that I should make each RVV instruction depend on them.
> Since we use vsetvl instruction (which is setting global VL_REGNUM(66)/VTYPE_RENUM(67) status) to set the global status for
> each RVV instruction.
> Including the dependency here is to make sure the global VL/VTYPE status is correct of each RVV instruction.
> (If we don't include such dependency in RVV instruction, instruction scheduling may move
> the RVV instructions and vsetvl instructions randomly then
> produce incorrect vsetvl configuration)
>
> The original reg_class of VL_REGNUM(66)/VTYPE_RENUM(67) I set here:
> riscv_regno_to_class [VL_REGNUM] = VL_REGS;
> riscv_regno_to_class [VTYPE_RENUM] = VTYPE_REGS;
> Such configuration make CSE failed.
>
> However, if I change the reg_class :
> riscv_regno_to_class [VL_REGNUM] = NO_REGS;
> riscv_regno_to_class [VTYPE_RENUM] = NO_REGS;
> The CSE now can do the optimization now!
>
> 1) Would you mind telling me the difference between them?

No idea.  I think CSE avoids touching hard register references because
eliding them to copies can increase register pressure.

> 2) If I set these 2 global status register as NO_REGS, will it create
> issues for the global status configuration of each RVV instructions ?

No idea either.  Usually these kinds of dependences are introduced by
targets at the point the VL setting is introduced, to avoid pessimizing
optimizations earlier.  Often, for cases like a VL register, this
is done only after register allocation, and it is indeed necessary to
keep the second scheduling pass from breaking things.

Richard.