Date: Thu, 15 Jun 2023 11:12:25 +0000 (UTC)
From: Richard Biener
To: Robin Dapp
cc: "juzhe.zhong@rivai.ai", gcc-patches, "richard.sandiford", krebbel,
    uweigand, linkw@linux.ibm.com
Subject: Re: [PATCH V2] VECT: Support LEN_MASK_LOAD/STORE to support flow
    control for length loop control
References: <20230612041438.272885-1-juzhe.zhong@rivai.ai>
    <1A27F1FE56A91998+2023061516473052116877@rivai.ai>
    <7e31f226-476a-75d1-6a0f-13fa9134e311@gmail.com>
    <93123844-a2a8-90ab-d47a-6f859fb3a8d6@gmail.com>

On Thu, 15 Jun 2023, Robin Dapp wrote:

> > the minus in 'operand 2 - operand 3' should be a plus if the
> > bias is really zero or -1.  I suppose
>
> Yes, that somehow got lost from when the bias was still +1.  Maybe
> Juzhe can fix this in the course of his patch.
>
> > that's quite conservative.  I think you can do better when the
> > loads are aligned, reading an extra byte when ignoring the bias
> > is OK and you at least know the very first element is used.
> > For stores you would need to emit compare&jump for all but
> > the first store of a group though ...
>
> The implementation is a first shot and yes we could do a bit
> better but limiting to a single rgroup is IMHO the more severe
> restriction.  The pattern wasn't hit very often across SPEC
> either way.  I think overall proper masking is more important for
> fixed-length vectors while length control might be more useful
> for variable-length vectors.  Just my gut feeling though, you're
> the expert there.
>
> > That said, I'm still not seeing where you actually apply the bias.
>
> We do
>
> +
> +  int partial_load_bias = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
> +  if (partial_load_bias != 0)
> +    {
> +      tree adjusted_len = rgc->bias_adjusted_ctrl;
> +      gassign *minus = gimple_build_assign (adjusted_len, PLUS_EXPR,
> +                                            rgc->controls[0],
> +                                            build_int_cst
> +                                            (TREE_TYPE (rgc->controls[0]),
> +                                             partial_load_bias));
> +      gimple_seq_add_stmt (header_seq, minus);
> +    }
> +
>
> as well as
>
> +         if (use_bias_adjusted_len)
> +           {
> +             gcc_assert (i == 0);
> +             tree adjusted_len =
> +               make_temp_ssa_name (len_type, NULL, "adjusted_loop_len");
> +             SSA_NAME_DEF_STMT (adjusted_len) = gimple_build_nop ();
> +             rgl->bias_adjusted_ctrl = adjusted_len;
> +           }

Ah, OK.

It's a bit odd to have predicates on define_expand.  The define_expand
pattern is expected to only match either literal 0 or literal -1
(and consistently so for all len_ optabs) and thus operand 2, the
length, needs to be adjusted by the middle-end to match up with
the pattern supplied operand 3.

Richard.
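
As a rough illustration of the arithmetic in the first quoted hunk, the
middle-end side of the adjustment can be sketched in plain C.  This is only
a sketch: adjusted_len () is a hypothetical helper, not a GCC function, and
it assumes nothing beyond what the hunk shows, namely that the adjusted
control is controls[0] plus the bias and that the bias is a literal 0 or -1.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC.  It mirrors the PLUS_EXPR in the
   quoted hunk: adjusted_len = controls[0] + partial_load_bias.  */
static long
adjusted_len (long len, int bias)
{
  /* Per the discussion, the len_ optabs only accept a literal bias of
     0 or -1.  */
  assert (bias == 0 || bias == -1);
  return len + bias;
}
```

With a bias of 0 the length passes through unchanged; with a bias of -1 a
loop length of 16 becomes an adjusted operand of 15.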