From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 2063)
	id 2C45A385417C; Wed, 24 Aug 2022 02:31:46 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2C45A385417C
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1661308306;
	bh=YZFGU5/w37XJj0MKbK8VBLlb5ocfyArcrc3pWBmUk0U=;
	h=From:To:Subject:Date:From;
	b=mZ77HEjvIz2aZe+FyRWWt+6GXM0tibI0SVGbEozIVwFULk14Tz16XERhiA9Fh2kQb
	 TB245sAWddQtXNtCedRBOeurSA1RWbYyz4zQarxPnDmgvnfUilxgRItqoCQlhMYDO9
	 2M1b9rCPoXvXc9VHy8QCfK5At8sMg1becWate5fE=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Kewen Lin
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r12-8710] vect: Don't allow vect_emulated_vector_p type in vectorizable_call [PR106322]
X-Act-Checkin: gcc
X-Git-Author: Kewen Lin
X-Git-Refname: refs/heads/releases/gcc-12
X-Git-Oldrev: d0d72e0b1ebbac487d70281a56799bf547034ec1
X-Git-Newrev: 9f532fec01d6651cc3cc136073f044a7953d8560
Message-Id: <20220824023146.2C45A385417C@sourceware.org>
Date: Wed, 24 Aug 2022 02:31:46 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:9f532fec01d6651cc3cc136073f044a7953d8560

commit r12-8710-g9f532fec01d6651cc3cc136073f044a7953d8560
Author: Kewen Lin
Date:   Tue Aug 16 00:18:51 2022 -0500

    vect: Don't allow vect_emulated_vector_p type in vectorizable_call [PR106322]

    As PR106322 shows, in some cases for some vector type whose
    TYPE_MODE is a scalar integral mode instead of a vector mode,
    it's possible to obtain wrong target support information when
    querying with the scalar integral mode.  For example, for the
    test case in PR106322, on ppc64 32bit vectorizer gets vector
    type "vector(2) short unsigned int" for scalar type "short
    unsigned int", its mode is SImode instead of V2HImode.  The
    target support querying checks umul_highpart optab with SImode
    and considers it's supported, then vectorizer further generates
    .MULH IFN call for that vector type.  Unfortunately it's wrong
    to use SImode support for that vector type multiply highpart
    here.

    This patch is to teach vectorizable_call analysis not to allow
    vect_emulated_vector_p type for both vectype_in and vectype_out
    as Richi suggested.

            PR tree-optimization/106322

    gcc/ChangeLog:

            * tree-vect-stmts.cc (vectorizable_call): Don't allow
            vect_emulated_vector_p type for both vectype_in and vectype_out.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr106322.c: New test.
            * gcc.target/powerpc/pr106322.c: New test.

    (cherry picked from commit 5239e2bd48fb1e6a1d1b06a1bac49bee0a742e98)

Diff:
---
 gcc/testsuite/gcc.target/i386/pr106322.c    | 51 +++++++++++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr106322.c | 50 ++++++++++++++++++++++++++++
 gcc/tree-vect-stmts.cc                      |  8 +++++
 3 files changed, 109 insertions(+)

diff --git a/gcc/testsuite/gcc.target/i386/pr106322.c b/gcc/testsuite/gcc.target/i386/pr106322.c
new file mode 100644
index 00000000000..31333c5fdcc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106322.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-require-effective-target ia32 } */
+/* { dg-options "-O2 -mtune=generic -march=i686" } */
+
+/* As PR106322, verify this can execute well (not abort).  */
+
+#define N 64
+typedef unsigned short int uh;
+typedef unsigned short int uw;
+uh a[N];
+uh b[N];
+uh c[N];
+uh e[N];
+
+__attribute__ ((noipa)) void
+foo ()
+{
+  for (int i = 0; i < N; i++)
+    c[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+}
+
+__attribute__ ((optimize ("-O0"))) void
+init ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = (uh) (0x7ABC - 0x5 * i);
+      b[i] = (uh) (0xEAB + 0xF * i);
+      e[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+    }
+}
+
+__attribute__ ((optimize ("-O0"))) void
+check ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      if (c[i] != e[i])
+        __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  init ();
+  foo ();
+  check ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106322.c b/gcc/testsuite/gcc.target/powerpc/pr106322.c
new file mode 100644
index 00000000000..c05072d3416
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106322.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mdejagnu-cpu=power4" } */
+
+/* As PR106322, verify this can execute well (not abort).  */
+
+#define N 64
+typedef unsigned short int uh;
+typedef unsigned short int uw;
+uh a[N];
+uh b[N];
+uh c[N];
+uh e[N];
+
+__attribute__ ((noipa)) void
+foo ()
+{
+  for (int i = 0; i < N; i++)
+    c[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+}
+
+__attribute__ ((optimize ("-O0"))) void
+init ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = (uh) (0x7ABC - 0x5 * i);
+      b[i] = (uh) (0xEAB + 0xF * i);
+      e[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+    }
+}
+
+__attribute__ ((optimize ("-O0"))) void
+check ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      if (c[i] != e[i])
+        __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  init ();
+  foo ();
+  check ();
+
+  return 0;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d8da13e312a..4c5d20a0e2c 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3419,6 +3419,14 @@ vectorizable_call (vec_info *vinfo,
       return false;
     }
 
+  if (vect_emulated_vector_p (vectype_in) || vect_emulated_vector_p (vectype_out))
+    {
+      if (dump_enabled_p ())
+        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                         "use emulated vector type for call\n");
+      return false;
+    }
+
   /* FORNOW */
   nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
   nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);