From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20231219053853.3764283-1-hongtao.liu@intel.com>
In-Reply-To: <20231219053853.3764283-1-hongtao.liu@intel.com>
From: Richard Biener
Date: Tue, 19 Dec 2023 13:48:54 +0100
Subject: Re: [PATCH] Optimize A < B ? A : B to MIN_EXPR.
To: liuhongt
Cc: gcc-patches@gcc.gnu.org, crazylht@gmail.com, hjl.tools@gmail.com
Content-Type: text/plain; charset="UTF-8"

On Tue, Dec 19, 2023 at 6:39 AM liuhongt wrote:
>
> Similar for A < B ? B : A to MAX_EXPR.
> There're codes in the frontend to optimize such pattern but failed to
> handle testcase in the PR since it's exposed at gimple level when
> folding backend builtins.
>
> pr95906 now can be optimized to MAX_EXPR as it's commented in the
> testcase.
>
> // FIXME: this should further optimize to a MAX_EXPR
> typedef signed char v16i8 __attribute__((vector_size(16)));
> v16i8 f(v16i8 a, v16i8 b)
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk? (or maybe wait for GCC 15).

I wonder if you can amend the existing patterns instead by iterating
over cond/vec_cond.  There are quite some (look for uses of
minmax_from_comparison) that could be adapted to vectors.

The ones matching the simple form you match are

#if GIMPLE
/* A >= B ? A : B -> max (A, B) and friends.  The code is still
   in fold_cond_expr_with_comparison for GENERIC folding with
   some extra constraints.  */
(for cmp (eq ne le lt unle unlt ge gt unge ungt uneq ltgt)
 (simplify
  (cond (cmp:c (nop_convert1?@c0 @0) (nop_convert2?@c1 @1))
        (convert3? @0)
        (convert4? @1))
  (if (!HONOR_SIGNED_ZEROS (type)
...

I think.  Consider at least placing the new patterns next to that.

> gcc/ChangeLog:
>
>         PR target/104401
>         * match.pd (A < B ? A : B -> MIN_EXPR): New patten match.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/pr104401.c: New test.
>         * gcc.dg/tree-ssa/pr95906.c: Adjust testcase.
> ---
>  gcc/match.pd                             | 20 ++++++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/pr95906.c  |  3 +--
>  gcc/testsuite/gcc.target/i386/pr104401.c | 27 ++++++++++++++++++++++++
>  3 files changed, 48 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104401.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index d57e29bfe1d..9584a70aa3d 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5263,6 +5263,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>         (view_convert:type
>          (vec_cond @4 (view_convert:vtype @2) (view_convert:vtype @3)))))))
>
> +/* Optimize A < B ? A : B to MIN (A, B)
> +           A > B ? A : B to MAX (A, B).  */
> +(for cmp (lt le gt ge)
> +     minmax (min min max max)
> +     MINMAX (MIN_EXPR MIN_EXPR MAX_EXPR MAX_EXPR)
> + (simplify
> +  (vec_cond (cmp @0 @1) @0 @1)
> +   (if (VECTOR_INTEGER_TYPE_P (type)
> +       && target_supports_op_p (type, MINMAX, optab_vector))
> +    (minmax @0 @1))))
> +
> +(for cmp (lt le gt ge)
> +     minmax (max max min min)
> +     MINMAX (MAX_EXPR MAX_EXPR MIN_EXPR MIN_EXPR)
> + (simplify
> +  (vec_cond (cmp @0 @1) @1 @0)
> +   (if (VECTOR_INTEGER_TYPE_P (type)
> +       && target_supports_op_p (type, MINMAX, optab_vector))
> +    (minmax @0 @1))))
> +
>  /* c1 ? c2 ? a : b : b  -->  (c1 & c2) ? a : b  */
>  (simplify
>   (vec_cond @0 (vec_cond:s @1 @2 @3) @3)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
> index 3d820a58e93..d15670f3e9e 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
> @@ -1,7 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fdump-tree-forwprop3-raw -w -Wno-psabi" } */
>
> -// FIXME: this should further optimize to a MAX_EXPR
>  typedef signed char v16i8 __attribute__((vector_size(16)));
>  v16i8 f(v16i8 a, v16i8 b)
>  {
> @@ -10,4 +9,4 @@ v16i8 f(v16i8 a, v16i8 b)
>  }
>
>  /* { dg-final { scan-tree-dump-not "bit_(and|ior)_expr" "forwprop3" } } */
> -/* { dg-final { scan-tree-dump-times "vec_cond_expr" 1 "forwprop3" } } */
> +/* { dg-final { scan-tree-dump-times "max_expr" 1 "forwprop3" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr104401.c b/gcc/testsuite/gcc.target/i386/pr104401.c
> new file mode 100644
> index 00000000000..8ce7ff88d9e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr104401.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse4.1" } */
> +/* { dg-final { scan-assembler-times "pminsd" 2 } } */
> +/* { dg-final { scan-assembler-times "pmaxsd" 2 } } */
> +
> +#include
> +
> +__m128i min32(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(value, input));
> +}
> +
> +__m128i max32(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(value, input));
> +}
> +
> +__m128i min32_1(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(input, value));
> +}
> +
> +__m128i max32_1(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(input, value));
> +}
> +
> --
> 2.31.1
>
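
For readers following the thread, a minimal sketch of the lane-wise
equivalence the quoted vec_cond patterns rely on may help.  It uses the
GNU C generic vector extension rather than the SSE intrinsics from
pr104401.c, and it only checks the semantics at run time; it makes no
claim about which instructions any given GCC revision emits.  The names
v4si, select_min, select_max and check_select_is_minmax are hypothetical
and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Four signed 32-bit lanes (hypothetical typedef, GNU C vector extension).  */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Select form of the minimum: per lane, a < b ? a : b.
   A vector comparison yields -1 (all ones) in true lanes and 0 elsewhere,
   so the select can be written as a mask blend.  */
v4si select_min(v4si a, v4si b)
{
    v4si mask = a < b;
    return (a & mask) | (b & ~mask);
}

/* Select form of the maximum: per lane, a > b ? a : b.  */
v4si select_max(v4si a, v4si b)
{
    v4si mask = a > b;
    return (a & mask) | (b & ~mask);
}

/* Scalar references that the select forms are compared against.  */
static int32_t ref_min(int32_t x, int32_t y) { return y < x ? y : x; }
static int32_t ref_max(int32_t x, int32_t y) { return y > x ? y : x; }

/* Aborts via assert if any lane of the select forms differs from the
   scalar min/max of the corresponding input lanes.  */
void check_select_is_minmax(v4si a, v4si b)
{
    v4si lo = select_min(a, b);
    v4si hi = select_max(a, b);
    for (int i = 0; i < 4; i++) {
        assert(lo[i] == ref_min(a[i], b[i]));
        assert(hi[i] == ref_max(a[i], b[i]));
    }
}
```

Calling check_select_is_minmax from a small main with inputs that include
equal lanes and INT32_MIN exercises the tie and extreme cases; ties are
why lt and le (and gt and ge) can share one result for integer vectors.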