From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id 1B3423858C52 for ; Fri, 26 Aug 2022 08:36:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1B3423858C52 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id C568A1F7AB; Fri, 26 Aug 2022 08:36:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1661503000; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=X5ditbi93JKJaT6HdLXMmmL5hby2PpEA2YlBx5BC4Sg=; b=yAwLHSkLBQIH2EQo0yyLgLXAgMX0C3D1yEaf8cMOiYSz3bqPtt2NnwKTBTcF1RofMTB7nC RLfOgVKBv95jnSi+3dK+uDnhf7Fw/P90q7TEZvmcV4D8TMjX7GwFw7tlQXVsoUZA9yaNLV Bf2tdUD/8dVR8SjiAVidpEH9k0PS/F8= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1661503000; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=X5ditbi93JKJaT6HdLXMmmL5hby2PpEA2YlBx5BC4Sg=; b=f/pTvfxRUIqTdXLBF7/JUWEpFm7oxj3KOxVDsgqNDgiMwiIuNLr/AepIyIxVTkCdK9zZ+J mLbVePne14l8ROCw== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 9B7B42C141; Fri, 26 Aug 2022 08:36:40 +0000 (UTC) Date: Fri, 26 Aug 2022 08:36:40 +0000 (UTC) From: Richard Biener To: Richard Sandiford cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] vect: Tighten get_related_vectype_for_scalar_type In-Reply-To: Message-ID: References: User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, 25 Aug 2022, Richard Sandiford wrote: > Builds of glibc with SVE enabled have been failing since V1DI was added > to the aarch64 port. The problem is that BB SLP starts the (hopeless) > attempt to use variable-length modes to vectorise a single-element > vector, and that now gets further than it did before. > > Initially we tried getting a vector mode with 1 + 1X DI elements > (i.e. 1 DI per 128-bit vector chunk). We don't provide such a mode -- > it would be VNx1DI -- because it isn't a native SVE format. We then > try just 1 DI, which previously failed but now succeeds. > > There are numerous ways we could fix this. Perhaps the most obvious > would be to skip variable-length modes for BB SLP. However, I think > that'd just be kicking the can down the road, since eventually we want > to support BB SLP and VLA vectors using predication. > > However, if we do use VLA vectors for BB SLP, the vector modes > we use should actually be variable length. We don't want to use > variable-length vectors for some element types/group sizes and > fixed-length vectors for others, since it would be difficult > to handle the seams. > > The same principle applies during loop vectorisation. We can't > use a mixture of variable-length and fixed-length vectors for > the same loop because the relative unroll/vectorisation factors > would not be constant (compile-time) multiples of each other. > > This patch therefore makes get_related_vectype_for_scalar_type > check that the provided number of units is interoperable with > the provided prevailing mode. The function is generally quite > forgiving -- it does basic things like checking for scalarness > itself rather than expecting callers to do them -- so the new > check feels in keeping with that. > > This seems to subsume the fix for PR96974. I'm not sure it's > worth reverting that code to an assert though, so the patch just > drops the scan for the associated message. > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? OK. Thanks, Richard. > Richard > > > gcc/ > * tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Check > that the requested number of units is interoperable with the requested > prevailing mode. > > gcc/testsuite/ > * gcc.target/aarch64/sve/slp_15.c: New test. > * g++.target/aarch64/sve/pr96974.C: Remove scan test. > --- > gcc/testsuite/g++.target/aarch64/sve/pr96974.C | 4 +--- > gcc/testsuite/gcc.target/aarch64/sve/slp_15.c | 17 +++++++++++++++++ > gcc/tree-vect-stmts.cc | 10 ++++++++++ > 3 files changed, 28 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_15.c > > diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C > index 54000f568ab..2f6ebd6ce3d 100644 > --- a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C > +++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4 -fdump-tree-slp-details" } */ > +/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" } */ > > float a; > int > @@ -14,5 +14,3 @@ struct c { > } > int coeffs[10]; > } f; > - > -/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of vector subparts between" "slp1" { target lp64 } } } */ > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c > new file mode 100644 > index 00000000000..23f6d567cc5 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c > @@ -0,0 +1,17 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-O3" } */ > + > +struct foo > +{ > + void *handle; > + void *arg; > +}; > + > +void > +dlinfo_doit (struct foo *args) > +{ > + __UINTPTR_TYPE__ **l = args->handle; > + > + *(__UINTPTR_TYPE__ *) args->arg = 0; > + *(__UINTPTR_TYPE__ *) args->arg = **l; > +} > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc > index c9dab217f05..7748c42c70f 100644 > --- a/gcc/tree-vect-stmts.cc > +++ b/gcc/tree-vect-stmts.cc > @@ -11486,6 +11486,16 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode, > > unsigned int nbytes = GET_MODE_SIZE (inner_mode); > > + /* Interoperability between modes requires one to be a constant multiple > + of the other, so that the number of vectors required for each operation > + is a compile-time constant. */ > + if (prevailing_mode != VOIDmode > + && !constant_multiple_p (nunits * nbytes, > + GET_MODE_SIZE (prevailing_mode)) > + && !constant_multiple_p (GET_MODE_SIZE (prevailing_mode), > + nunits * nbytes)) > + return NULL_TREE; > + > /* For vector types of elements whose mode precision doesn't > match their types precision we use a element type of mode > precision. The vectorization routines will have to make sure > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)