From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by sourceware.org (Postfix) with ESMTPS id 6430B3858D1E for ; Fri, 10 Nov 2023 11:53:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6430B3858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6430B3858D1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:67c:2178:6::1c ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699617221; cv=none; b=uPRs2ydVWQHVGVuDgDYlrjOpR/DCVgUQC++HshEsnwFjf7L1k9osfWLmsrdddETolYnHGD+Y/ne525lIeNpfZq5NxitrKTO2WjqbnifdwKOIgaX80kcg0fhky5PxLmMds7kkXIwJ+XcFP+dOIEMnzJXWEvcSXZJ9qOzPYHzB5/k= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699617221; c=relaxed/simple; bh=UEIu+aiif3b9BOqI9G8elIPkBukZYVnnCMr1EB7kcIE=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=kuRC+J/NSMowuQqLO4EtG4DBna1PjBguJKqJcRpYfdwufwmzztr2Fwt2LgGp7wNbd9zOdXcmsF0npi959u5APuzPoKSgs1OQyKY9hrkWMcVMp0VPExf4uce/aV8LM0mnhvElDUFlb7h+RwYeqYWAihEV7cuSK/9wZDbqNdJxr/o= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id B233A21959; Fri, 10 Nov 2023 11:53:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1699617218; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Q2a050MQ2raUA7w0s5meSjcmUQuKR1FsdtoGAl2h+0E=; b=fS3RNDfxnwshUZcnd/IwsjfOezboVMTuyCohmLmgAIcBgXGRP4MIiB8uW9MXsOWZRtpj2I 47K0YoUFQ3L+tnwmBzJNdjQwrp4w4qtHXaCbt4Fn8hknd9lIepR8xYV0cSUYwijeo6hT6k j7oeqFZunPPWgG1XyUgiBoWX6CWIzac= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1699617218; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=Q2a050MQ2raUA7w0s5meSjcmUQuKR1FsdtoGAl2h+0E=; b=IYfCbYHjLHzdbURP/dDDel+D45LfseHHF9XrWHpCJzaW7bZJk9AudVZ7SPpTIRIyTPSpj6 mrNeJckx5TdcrACg== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 840452C4DA; Fri, 10 Nov 2023 11:53:38 +0000 (UTC) Date: Fri, 10 Nov 2023 11:53:38 +0000 (UTC) From: Richard Biener To: Andrew Stubbs cc: "gcc-patches@gcc.gnu.org" Subject: Re: [PATCH] vect: Don't set excess bits in unform masks In-Reply-To: Message-ID: References: <63e907af-cde8-4f63-bba9-d39fcd5623fb@codesourcery.com> User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-5.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, 10 Nov 2023, Andrew Stubbs wrote: > On 23/10/2023 11:43, Richard Biener wrote: > > On Fri, 20 Oct 2023, Andrew Stubbs wrote: > > > >> This patch fixes a wrong-code bug on amdgcn in which the excess "ones" in > >> the > >> mask enable extra lanes that were supposed to be unused and are therefore > >> undefined. > >> > >> Richi suggested an alternative approach involving narrower types and then a > >> zero-extend to the actual mask type. This solved the problem for the > >> specific > >> test case that I had, but I'm not sure if it would work with V2 and V4 > >> modes > >> (not that I've observed bad behaviour from them anyway, but still). There > >> were some other caveats involving "two-lane division" that I don't fully > >> understand, so I went with the simpler implementation. > >> > >> This patch does have the disadvantage of an additional "and" instruction in > >> the non-constant case even for machines that don't need it. I'm not sure > >> how > >> to fix that without an additional target hook. (If GCC could use the > >> 64-lane > >> vectors more effectively without the assistance of artificially reduced > >> sizes > >> then this problem wouldn't exist.) > >> > >> OK to commit? > > > > - convert_move (target, op0, 0); > > + rtx tmp = gen_reg_rtx (mode); > > + convert_move (tmp, op0, 0); > > + > > + if (known_ne (TYPE_VECTOR_SUBPARTS (type), > > + GET_MODE_PRECISION (mode))) > > > > Usually this would be maybe_ne, but then ... > > > > + { > > + /* Ensure no excess bits are set. > > + GCN needs this, AVX does not. */ > > + expand_binop (mode, and_optab, tmp, > > + GEN_INT ((1 << (TYPE_VECTOR_SUBPARTS (type) > > + .to_constant())) - 1), > > + target, true, OPTAB_DIRECT); > > > > here you have .to_constant (). I think with having an integer mode > > we know subparts is constant so I'd prefer > > > > auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant (); > > if (maybe_ne (GET_MODE_PRECISION (mode), nunits) > > ... > > > > + } > > + else > > + emit_move_insn (target, tmp); > > > > note you need the emit_move_insn also for the expand_binop > > path since it's not guaranteed that 'target' is used there. Thus > > > > tmp = expand_binop (...) > > if (tmp != target) > > emit_move_insn (...) > > > > Otherwise looks good to me. The and is needed on x86 for > > two and four bit masks, it would be more efficient to use > > smaller modes for the sign-extension I guess. > > I think this patch addresses these issues. I've confirmed it does the right > thing on amdgcn. > > OK? OK. thanks, Richard. > Andrew > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)