Date: Mon, 5 Feb 2024 10:56:49 +0100 (CET)
From: Richard Biener
To: "Andre Vieira (lists)"
cc: gcc-patches@gcc.gnu.org, Richard.Sandiford@arm.com
Subject: Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE
In-Reply-To: <359e8112-65c9-40b1-9566-aa31165c05e8@arm.com>
References: <20240130143132.9575-1-andre.simoesdiasvieira@arm.com>
 <20240130143132.9575-2-andre.simoesdiasvieira@arm.com>
 <47e1aeb2-94ac-4733-b49f-ea97932cc49f@arm.com>
 <545r8s73-675p-4o48-sr66-q6956nqp6r6p@fhfr.qr>
 <3rq8sn71-8188-o4rq-9spp-q9spn98163q5@fhfr.qr>
 <359e8112-65c9-40b1-9566-aa31165c05e8@arm.com>

On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:

>
>
> On 01/02/2024 07:19, Richard Biener wrote:
> > On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> >
> >
> > The patch didn't come with a testcase so it's really hard to tell
> > what goes wrong now and how it is fixed ...
>
> My bad! I had a testcase locally but never added it...
>
> However... now I look at it and ran it past Richard S, the codegen isn't
> 'wrong', but it does have the potential to lead to some pretty slow codegen,
> especially for inbranch simdclones where it transforms the SVE predicate into
> an Advanced SIMD vector by inserting the elements one at a time...
>
> An example of which can be seen if you do:
>
> gcc -O3 -march=armv8-a+sve -msve-vector-bits=128 -fopenmp-simd t.c -S
>
> with the following t.c:
> #pragma omp declare simd simdlen(4) inbranch
> int __attribute__ ((const)) fn5(int);
>
> void fn4 (int *a, int *b, int n)
> {
>   for (int i = 0; i < n; ++i)
>     b[i] = fn5(a[i]);
> }
>
> Now I do have to say, for our main usecase of libmvec we won't have any
> 'inbranch' Advanced SIMD clones, so we avoid that issue... But of course that
> doesn't mean user-code will.

It seems to use SVE masks with vector(4) and the ABI says the mask is
vector(4) int.  You say that's because we choose an Adv SIMD clone for
the SVE VLS vector code (it calls _ZGVnM4v_fn5).

The vectorizer creates

  _44 = VEC_COND_EXPR ;

and then vector lowering decomposes this.  That means the vectorizer
lacks a check that the target handles this VEC_COND_EXPR.

Of course I would expect that SVE with VLS vectors is able to
code generate this operation, so it's missing patterns in the end.

Richard.

> I'm gonna remove this patch and run another test regression to see if it
> catches anything weird, but if not then I guess we do have the option to not
> use this patch and aim to solve the costing or codegen issue in GCC-15. We
> don't currently do any simdclone costing and I don't have a clear suggestion
> for how given openmp has no mechanism that I know of to expose the speedup of
> a simdclone over its scalar variant, so how would we 'compare' a simdclone
> call with extra overhead of argument preparation vs scalar, though at least we
> could prefer a call to a different simdclone with less argument preparation.
> Anyways I digress.
>
> Other tests, these require aarch64-autovec-preference=2 so that also has me
> worried less...
>
> gcc -O3 -march=armv8-a+sve -msve-vector-bits=128 --param
> aarch64-autovec-preference=2 -fopenmp-simd t.c -S
>
> t.c:
> #pragma omp declare simd simdlen(2) notinbranch
> float __attribute__ ((const)) fn1(double);
>
> void fn0 (float *a, float *b, int n)
> {
>   for (int i = 0; i < n; ++i)
>     b[i] = fn1((double) a[i]);
> }
>
> #pragma omp declare simd simdlen(2) notinbranch
> float __attribute__ ((const)) fn3(float);
>
> void fn2 (float *a, double *b, int n)
> {
>   for (int i = 0; i < n; ++i)
>     b[i] = (double) fn3(a[i]);
> }
>
> > Richard.
> >
> >>>
> >>> That said, I wonder how we end up mixing things up in the first place.
> >>>
> >>> Richard.
> >>
> >
>

-- 
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
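
For readers following the inbranch case discussed above, here is a rough
scalar model in plain C of what the masked 4-lane call amounts to.  It is
only a sketch under stated assumptions, not GCC's actual code generation:
fn5_scalar, pred_to_int_mask, fn5_clone_model and fn4_model are hypothetical
names, the body of fn5_scalar is made up (the message only declares fn5), and
the mask encoding (all-ones for an active lane, zero otherwise) is assumed,
since the operands of the VEC_COND_EXPR are not shown in the message.

```c
#include <assert.h>
#include <stdbool.h>

#define LANES 4

/* Hypothetical stand-in for fn5; the message only declares it.  */
static int fn5_scalar (int x)
{
  return x * 2 + 1;
}

/* Model of the mask conversion: one boolean per lane (the loop
   predicate) becomes one int per lane (the clone's mask argument).
   Assumed encoding: -1 for an active lane, 0 for an inactive one.  */
static void pred_to_int_mask (const bool pred[LANES], int mask[LANES])
{
  for (int l = 0; l < LANES; ++l)
    mask[l] = pred[l] ? -1 : 0;
}

/* Model of a masked 4-lane clone: only lanes with a nonzero mask
   element are computed; other lanes of OUT are left untouched.  */
static void fn5_clone_model (const int in[LANES], const int mask[LANES],
                             int out[LANES])
{
  for (int l = 0; l < LANES; ++l)
    if (mask[l] != 0)
      out[l] = fn5_scalar (in[l]);
}

/* Model of the vectorized fn4 loop: process LANES elements per
   iteration, masking off lanes past the end of the arrays.  */
static void fn4_model (const int *a, int *b, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i += LANES)
    {
      bool pred[LANES];
      int mask[LANES];
      int in[LANES] = { 0 };
      int out[LANES] = { 0 };

      for (int l = 0; l < LANES; ++l)
        {
          pred[l] = i + l < n;
          if (pred[l])
            in[l] = a[i + l];
        }

      pred_to_int_mask (pred, mask);
      fn5_clone_model (in, mask, out);

      for (int l = 0; l < LANES; ++l)
        if (pred[l])
          b[i + l] = out[l];
    }
}
```

Calling fn4_model (a, b, n) should give the same b[] as the scalar loop in
fn4 for any n >= 0.  The point of interest is pred_to_int_mask: it is the
step that, per the discussion above, ends up decomposed lane by lane when
the target has no pattern for the conversion.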