From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20240109104648.675293-1-hongtao.liu@intel.com>
In-Reply-To: <20240109104648.675293-1-hongtao.liu@intel.com>
From: Richard Biener
Date: Wed, 10 Jan 2024 13:01:00 +0100
Subject: Re: [PATCH] Optimize A < B ? A : B to MIN_EXPR.
To: liuhongt
Cc: gcc-patches@gcc.gnu.org
Content-Type: text/plain; charset="UTF-8"

On Tue, Jan 9, 2024 at 11:48 AM liuhongt wrote:
>
> > I wonder if you can amend the existing patterns instead by iterating
> > over cond/vec_cond. There are quite some (look for uses of
> > minmax_from_comparison) that could be adapted to vectors.
> >
> > The ones matching the simple form you match are
> >
> > #if GIMPLE
> > /* A >= B ? A : B -> max (A, B) and friends. The code is still
> >    in fold_cond_expr_with_comparison for GENERIC folding with
> >    some extra constraints. */
> > (for cmp (eq ne le lt unle unlt ge gt unge ungt uneq ltgt)
> >  (simplify
> >   (cond (cmp:c (nop_convert1?@c0 @0) (nop_convert2?@c1 @1))
> >         (convert3? @0) (convert4? @1))
> >   (if (!HONOR_SIGNED_ZEROS (type)
> > ...
> This pattern is a conditional operation that treats a vector as a complete
> unit, it's more like cbranchm which is different from vec_cond_expr.
> So I add my patterns after this.
> >
> > I think. Consider at least placing the new patterns next to that.
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

OK.

Richard.

> Similar for A < B ? B : A to MAX_EXPR.
> There're codes in the frontend to optimize such pattern but failed to
> handle testcase in the PR since it's exposed at gimple level when
> folding backend builtins.
>
> pr95906 now can be optimized to MAX_EXPR as it's commented in the
> testcase.
>
> // FIXME: this should further optimize to a MAX_EXPR
> typedef signed char v16i8 __attribute__((vector_size(16)));
> v16i8 f(v16i8 a, v16i8 b)
>
> gcc/ChangeLog:
>
>         PR target/104401
>         * match.pd (VEC_COND_EXPR: A < B ? A : B -> MIN_EXPR): New pattern match.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/i386/pr104401.c: New test.
>         * gcc.dg/tree-ssa/pr95906.c: Adjust testcase.
> ---
>  gcc/match.pd                             | 21 ++++++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/pr95906.c  |  3 +--
>  gcc/testsuite/gcc.target/i386/pr104401.c | 27 ++++++++++++++++++++++++
>  3 files changed, 49 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104401.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7b4b15acc41..d8e2009a83f 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5672,6 +5672,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>      (if (VECTOR_TYPE_P (type))
>       (view_convert @c0)
>       (convert @c0))))))))
> +
> +/* This is for VEC_COND_EXPR
> +   Optimize A < B ? A : B to MIN (A, B)
> +            A > B ? A : B to MAX (A, B). */
> +(for cmp (lt le ungt unge gt ge unlt unle)
> +     minmax (min min min min max max max max)
> +     MINMAX (MIN_EXPR MIN_EXPR MIN_EXPR MIN_EXPR MAX_EXPR MAX_EXPR MAX_EXPR MAX_EXPR)
> + (simplify
> +  (vec_cond (cmp @0 @1) @0 @1)
> +   (if (VECTOR_INTEGER_TYPE_P (type)
> +       && target_supports_op_p (type, MINMAX, optab_vector))
> +    (minmax @0 @1))))
> +
> +(for cmp (lt le ungt unge gt ge unlt unle)
> +     minmax (max max max max min min min min)
> +     MINMAX (MAX_EXPR MAX_EXPR MAX_EXPR MAX_EXPR MIN_EXPR MIN_EXPR MIN_EXPR MIN_EXPR)
> + (simplify
> +  (vec_cond (cmp @0 @1) @1 @0)
> +   (if (VECTOR_INTEGER_TYPE_P (type)
> +       && target_supports_op_p (type, MINMAX, optab_vector))
> +    (minmax @0 @1))))
>  #endif
>
>  (for cnd (cond vec_cond)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
> index 3d820a58e93..d15670f3e9e 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
> @@ -1,7 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fdump-tree-forwprop3-raw -w -Wno-psabi" } */
>
> -// FIXME: this should further optimize to a MAX_EXPR
>  typedef signed char v16i8 __attribute__((vector_size(16)));
>  v16i8 f(v16i8 a, v16i8 b)
>  {
> @@ -10,4 +9,4 @@ v16i8 f(v16i8 a, v16i8 b)
>  }
>
>  /* { dg-final { scan-tree-dump-not "bit_(and|ior)_expr" "forwprop3" } } */
> -/* { dg-final { scan-tree-dump-times "vec_cond_expr" 1 "forwprop3" } } */
> +/* { dg-final { scan-tree-dump-times "max_expr" 1 "forwprop3" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/pr104401.c b/gcc/testsuite/gcc.target/i386/pr104401.c
> new file mode 100644
> index 00000000000..8ce7ff88d9e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr104401.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse4.1" } */
> +/* { dg-final { scan-assembler-times "pminsd" 2 } } */
> +/* { dg-final { scan-assembler-times "pmaxsd" 2 } } */
> +
> +#include
> +
> +__m128i min32(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(value, input));
> +}
> +
> +__m128i max32(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(value, input));
> +}
> +
> +__m128i min32_1(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(input, value));
> +}
> +
> +__m128i max32_1(__m128i value, __m128i input)
> +{
> +  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(input, value));
> +}
> +
> --
> 2.31.1
>
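
For readers of the archive, the equivalence the quoted patterns rely on can be modelled lane by lane in plain C. This is only an illustrative sketch, not code from the patch: the helper names (lane_min, lane_max, select_lt, select_gt, check_minmax_equivalence) are hypothetical, and scalar loops over int32_t arrays stand in for the integer vector operations that match.pd actually rewrites.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference per-lane minimum and maximum (hypothetical helpers). */
static int32_t lane_min(int32_t a, int32_t b) { return b < a ? b : a; }
static int32_t lane_max(int32_t a, int32_t b) { return a < b ? b : a; }

/* out[i] = a[i] < b[i] ? a[i] : b[i], the select shape of min32 in the test. */
static void select_lt(const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] < b[i] ? a[i] : b[i];
}

/* out[i] = a[i] > b[i] ? a[i] : b[i], the select shape of max32 in the test. */
static void select_gt(const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] > b[i] ? a[i] : b[i];
}

/* Asserts, lane by lane, that all four select forms agree with min/max. */
static void check_minmax_equivalence(const int32_t *a, const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Arms in comparison order: A < B ? A : B and A > B ? A : B. */
        assert((a[i] < b[i] ? a[i] : b[i]) == lane_min(a[i], b[i]));
        assert((a[i] > b[i] ? a[i] : b[i]) == lane_max(a[i], b[i]));
        /* Swapped arms: A < B ? B : A and A > B ? B : A. */
        assert((a[i] < b[i] ? b[i] : a[i]) == lane_max(a[i], b[i]));
        assert((a[i] > b[i] ? b[i] : a[i]) == lane_min(a[i], b[i]));
    }
}
```

Calling check_minmax_equivalence on any pair of equally sized int32_t arrays exercises the four select forms covered by the two pattern groups. The model covers signed integer lanes only, which matches the VECTOR_INTEGER_TYPE_P guard in the quoted patterns; it says nothing about floating-point lanes, where NaNs and signed zeros would need separate treatment.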