From: Richard Biener
Date: Mon, 23 May 2022 14:35:44 +0200
Subject: Re: [PATCH] PR tree-optimization/105668: Provide RTL expansion for VEC_COND_EXPR.
To: Roger Sayle
Cc: GCC Patches

On Mon, May 23, 2022 at 8:44 AM Roger Sayle wrote:
>
>
> This resolves PR tree-optimization/105668, a P1 ice-on-valid regression
> triggered by my recent patch to add a vec_cmpeqv1tiv1ti define_expand
> to the i386 backend.  The existence of this optab currently leads GCC
> to incorrectly assume the existence of a corresponding vcond_mask for
> V1TImode.
>
> I believe the best solution (of the three possible fixes) is to allow
> gimple_expand_vec_cond_expr to fail (return NULL) when a suitable optab
> to generate an IFN_VCOND_MASK isn't available, and instead allow RTL
> expansion to provide a default implementation using vector mode logic
> operations.  On x86_64, the equivalent of a pblend can be generated in
> three instructions using pand, pandn and pxor.  In fact, this fallback
> implementation is already used in ix86_expand_sse_movcc when the -march
> doesn't provide a suitable instruction.  This patch provides that
> functionality to all targets in the middle-end, allowing the vectorizer(s)
> to safely assume support for VEC_COND_EXPR (when the target has suitable
> vector logic instructions).
>
> I should point out (for the record) that the new expand_vec_cond_expr
> function in expr.cc is very different from the function of the same name
> removed by Martin Liska in June 2020.
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547097.html
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=502d63b6d6141597bb18fd23c87736a1b384cf8f
> That function simply expanded the vcond_mask optab and failed if it
> wasn't available, which is currently the task of the gimple-isel pass.
> The implementation here is a traditional RTL expander, synthesizing the
> desired vector conditional move using bit-wise XOR and AND instructions
> of the mask vector.
>
> At some point in the future, gimple-isel could be enhanced to consider
> alternative vector modes, as a V1TI blend/vec_cond_expr may be implemented
> using V2DI, V4SI, V8HI or V16QI.  Alas, I couldn't figure out how to
> conveniently iterate over the desired modes, so this enhancement is left
> for a follow-up patch.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32} with
> no new failures.  Ok for mainline?

No, first of all the purpose of ISEL is to get rid of _all_
VEC_COND_EXPRs.  So if anything, this fallback would have to reside in
the ISEL pass, replacing the GIMPLE with target-supported GIMPLE.

But then it is the task of tree-vect-generic.cc to turn GIMPLE that the
target does not support into GIMPLE that it does - and the issue in the
PR in question is that at that point we basically have

  _1 < _2 ? _3 : _4

which _is_ supported by the target, but passes in between vector lowering
and ISEL hide _1 < _2 via a PHI node and so the GIMPLE is no longer
target supported.  That would be the thing to fix - I'll note that we
have put ourselves into the corner of needing to keep the SSA def of the
VEC_COND_EXPR condition "next" (as in SSA def) to the VEC_COND_EXPR,
something that's difficult to maintain, especially when so many passes
run in between.

So the solution might be to somehow move the two closer together,
maybe as much as merging the VEC_COND_EXPR part into vector
lowering itself (with the disadvantage of an IL that is more difficult
to deal with).
Or alternatively have vectors produced by user code lowered earlier
and have "final" lowering done as part of RTL expansion (so in ISEL then).

Richard.

>
> 2022-05-23  Roger Sayle
>
> gcc/ChangeLog
>         PR tree-optimization/105668
>         * expr.cc (expand_vec_cond_expr): New function to expand
>         VEC_COND_EXPR using vector mode logical instructions.
>         (expand_expr_real_2) : Call the above.
>         * gimple-isel.cc (gimple_expand_vec_cond_expr): Instead of
>         asserting, return NULL when the target's get_vcond_mask_icode
>         returns CODE_FOR_nothing.
>
> gcc/testsuite/ChangeLog
>         PR tree-optimization/105668
>         * gcc.target/i386/pr105668.c: New test case.
>
> Roger
> --
>
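
For reference, the mask-based fallback that the quoted patch description
talks about (a conditional move built only from AND/ANDN/XOR of the mask)
can be sketched on plain 64-bit integers.  This is only an illustration of
the bit trick, not code from the patch or from GCC; the function names
select_and_andn_xor and select_xor_and are made up for this sketch, and a
real expander would operate on vector modes rather than uint64_t.

```c
#include <stdint.h>

/* Hypothetical illustration, not GCC code.
   For every bit position: take the bit from A where MASK is 1,
   otherwise take it from B.  */

/* Three-operation form, analogous to pand, pandn, pxor:
   the two AND results never overlap, so XOR acts like OR here.  */
static uint64_t
select_and_andn_xor (uint64_t mask, uint64_t a, uint64_t b)
{
  uint64_t from_a = a & mask;    /* pand  */
  uint64_t from_b = b & ~mask;   /* pandn */
  return from_a ^ from_b;        /* pxor  */
}

/* XOR/AND form: start from B and flip exactly those bits where
   A and B differ and MASK is set.  */
static uint64_t
select_xor_and (uint64_t mask, uint64_t a, uint64_t b)
{
  return b ^ ((a ^ b) & mask);
}
```

Both forms assume each mask element is all-ones or all-zeros, which is
what a vector comparison produces; with a partially set mask they still
agree with each other but blend individual bits rather than whole
elements.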