From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by sourceware.org (Postfix) with ESMTPS id 281A83858D1E for ; Wed, 17 Aug 2022 11:31:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 281A83858D1E Received: by mail-ed1-x52c.google.com with SMTP id z20so17055754edb.9 for ; Wed, 17 Aug 2022 04:31:59 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc; bh=luXBMRaf764y3K4lEYu7LT5G/Q+klJ86/qq97+u4iEs=; b=d0MZKh6uabemxDNwB7vXfPGeL4R21b9aYuHrS+CtXsH3qTq19pOlJSDhYOgcJBW97L G/aw6UKIiQN7v4tKHLgAmsVQv4sCbisAR36gfV2u/LSpzwvgXNl+inXZpwnOvLESHJrf 0h+moknVJU8KV2qJcFk4ILyXISurMepTD/d8zm//LigSk6VB7rF3Bfta5VYy3AEkzbbO ulm6IBNPM5srykOvbm4cPU8dQaQCRpaYsATdyOxgJRgx7RsKATTboj+FApI8zf07AHe+ uM2Mq9hgGGoNK6EwIxXC2jTpIptkBJXSoOAtx3WFory2Q+sWTdFSteQosf2YSs/oyPIT vnjg== X-Gm-Message-State: ACgBeo3JFd+UP7n18ePpyQFQsEtufnbsb6wjDVXjLbB9ziblqqxb/8y8 l5JEqCrRc2OSRaUplNqecD3dBXKwnODKd+zcRLRc2XH/0H0= X-Google-Smtp-Source: AA6agR6w6Xs96ugepfrl4yK28VEKcZDsuqug9NJdMoy81YFmSkW6/ciwDHV33dJW+sQlL/YwkS7TpABsp1F2TmG9COA= X-Received: by 2002:a05:6402:43c3:b0:445:db76:de71 with SMTP id p3-20020a05640243c300b00445db76de71mr2226170edc.218.1660735917875; Wed, 17 Aug 2022 04:31:57 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Wed, 17 Aug 2022 13:31:45 +0200 Message-ID: Subject: Re: ICE after folding svld1rq to vec_perm_expr duing forwprop To: Prathamesh Kulkarni , Richard Biener , gcc Patches , Richard Sandiford Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Aug 2022 11:32:01 -0000 On Tue, Aug 16, 2022 at 6:30 PM Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > On Tue, 9 Aug 2022 at 18:42, Richard Biener wrote: > >> > >> On Tue, Aug 9, 2022 at 12:10 PM Prathamesh Kulkarni > >> wrote: > >> > > >> > On Mon, 8 Aug 2022 at 14:27, Richard Biener w>> > > > >> > > > >> > > /* If result vector has greater length than input vector, > >> > > + then allow permuting two vectors as long as: > >> > > + a) sel.nelts_per_pattern == 1 > >> > > + b) sel.npatterns == len of input vector. > >> > > + The intent is to permute input vectors, and > >> > > + dup the elements in resulting vector to target vector length. */ > >> > > + > >> > > + if (maybe_gt (TYPE_VECTOR_SUBPARTS (type), > >> > > + TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0)))) > >> > > + { > >> > > + nelts = sel.encoding ().npatterns (); > >> > > + if (sel.encoding ().nelts_per_pattern () != 1 > >> > > + || (!known_eq (nelts, TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0))))) > >> > > + return NULL_TREE; > >> > > + } > >> > > > >> > > so the only case you add is non-VLA to VLA and there > >> > > explicitely only the case of a period that's same as the > >> > > element count in the input vectors. > >> > > > >> > > > >> > > @@ -2602,6 +2602,9 @@ dump_generic_node (pretty_printer *pp, tree > >> > > node, int spc, dump_flags_t flags, > >> > > pp_space (pp); > >> > > } > >> > > } > >> > > + if (VECTOR_TYPE_P (TREE_TYPE (node)) > >> > > + && !TYPE_VECTOR_SUBPARTS (TREE_TYPE (node)).is_constant ()) > >> > > + pp_string (pp, ", ... "); > >> > > pp_right_brace (pp); > >> > > > >> > > btw, I do wonder if VLA CONSTRUCTORs are a "thing"? Are they? > >> > Well, it got created for the following case after folding: > >> > svint32_t f2(int a, int b, int c, int d) > >> > { > >> > int32x4_t v = {a, b, c, d}; > >> > return svld1rq_s32 (svptrue_b8 (), &v[0]); > >> > } > >> > > >> > The svld1rq_s32 call gets folded to: > >> > v = {a, b, c, d} > >> > lhs = VEC_PERM_EXPR > >> > > >> > fold_vec_perm then folds the above VEC_PERM_EXPR to > >> > VLA constructor, since elements in v (in_elts) are not constant, and > >> > need_ctor is thus true: > >> > lhs = {a, b, c, d, ...} > >> > I added "..." to make it more explicit that it's a VLA constructor. > >> > >> But I doubt we do anything reasonable with such a beast? Do we? > >> I suppose it's like a vec_duplicate if you view it as V1TImode > >> but do we actually make sure to do this duplication? > > I am not sure. As mentioned above, the current code-gen for VLA > > constructor looks pretty bad. > > Should we avoid folding VLA constructors for now ? > > VLA constructors aren't really a thing. At least, the only VLA vector > you could represent with current CONSTRUCTOR nodes is a fixed-length > sequence at the start of an otherwise zero vector. I'm not sure > we even use that though (perhaps we do and I've forgotten). > > > I guess these are 2 different issues: > > (a) Resolving ICE with VEC_PERM_EXPR for above aarch64 tests. > > (b) Extending fold_vec_perm to handle vectors with differing lengths. > > > > For (a), I think the issue with using: > > res_type = gimple_assign_lhs (stmt) > > in previous patch, was that op2's type will change to match tgt_units, > > if we go thru > > (code == VIEW_CONVERT_EXPR || code2 == VIEW_CONVERT_EXPR) branch, > > and may thus not be same as len(lhs_type) anymore, and hit the assert > > in fold_vec_perm. > > > > IIUC, for lhs = VEC_PERM_EXPR, we now have the > > following semantics: > > (1) Element types for lhs, rhs1 and rhs2 should be the same. > > (2) len(lhs) == len(mask) and len(rhs1) == len(rhs2). > > Yeah. > > > The attached patch changes res_type from TREE_TYPE (arg0) to following: > > res_type = build_vector_type (TREE_TYPE (TREE_TYPE (arg0)), > > TYPE_VECTOR_SUBPARTS (op2)) > > so it has same element type as arg0 (and arg1) and len of op2. > > Does that look reasonable ? > > > > If we need a cast from res_type to lhs_type, then both would be fixed > > width vectors > > with len(lhs_type) being a multiple of len(res_type). > > IIUC, we don't support casting from VLA vector to/from fixed width vector, > > Yes, that's not supported as a cast. If the compiler knows the > length of the "VLA" vector then it's not VLA. If it doesn't > know the length of the VLA vector then the sizes could be different > (preventing VIEW_CONVERT_EXPR) and the number of elements could be > different (preventing pointwise CONVERT_EXPRs). > > > or from VLA vector of one type to VLA vector of other type ? > > That's supported though. They work just like VLS vectors: if the sizes > are the same then we can use VIEW_CONVERT_EXPR, if the number of elements > are the same then we can do pointwise conversions (e.g. element-by-element > extensions, truncations, conversions to float, conversions to integer, etc). > > > Currently, if op2 is VLA, and we enter the branch: > > (code == VIEW_CONVERT_EXPR || code2 == VIEW_CONVERT_EXPR) > > then I think it will bail out because op2_units will not be a compile > > time constant, > > and constant_multiple_p (op2_units, tgt_units, &factor) would return false. > > > > The patch resolves ICE's for aarch64 cases, without regressing the > > ones in PR106360. > > Does it look OK ? > > Richi should have the final say, but it looks OK in principle to me. > I'm not sure it's worth applying independently of the follow-on patch > though, if that patch is in the offing anyway. > > I guess my only question is whether tree-ssa-forwprop.cc really needs > to build a new type. Couldn't it just use the TREE_TYPE of the lhs > instead? I think the point was they are not necssarily the same when we looked through a VIEW_CONVERT? A comment might be in order here. Btw, please don't add new asserts that trivially hold in critical paths. diff --git a/gcc/match.pd b/gcc/match.pd index 330c1db0c8e..aa20cc713c5 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -7845,6 +7845,12 @@ and, (with { tree op0 = @0, op1 = @1, op2 = @2; + + gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (type), + TYPE_VECTOR_SUBPARTS (TREE_TYPE (op2)))); + gcc_assert (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (op0)), + TYPE_VECTOR_SUBPARTS (TREE_TYPE (op1)))); + machine_mode result_mode = TYPE_MODE (type); machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0)); > > Thanks, > Richard