From: Richard Biener
Date: Tue, 12 Nov 2019 09:40:00 -0000
Subject: Re: [14/n] Vectorise conversions between differently-sized integer vectors
To: Richard Biener, GCC Patches, Richard Sandiford
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
X-SW-Source: 2019-11/txt/msg00856.txt.bz2

On Wed, Nov 6, 2019 at 1:45 PM Richard Sandiford wrote:
>
> Richard Biener writes:
> > On Fri, Oct 25, 2019 at 2:51 PM Richard Sandiford
> > wrote:
> >>
> >> This
> >> patch adds AArch64 patterns for converting between 64-bit and
> >> 128-bit integer vectors, and makes the vectoriser and expand pass
> >> use them.
> >
> > So on GIMPLE we'll see
> >
> > v4si _1;
> > v4di _2;
> >
> > _1 = (v4si) _2;
> >
> > then, correct?  Likewise for float conversions.
> >
> > I think that's "new", can you add to tree-cfg.c:verify_gimple_assign_unary
> > verification that the number of lanes of the LHS and the RHS match please?
>
> Ah, yeah.  How's this?  Tested as before.

OK.

Thanks,
Richard.

> Richard
>
>
> 2019-11-06  Richard Sandiford
>
> gcc/
>         * tree-cfg.c (verify_gimple_assign_unary): Handle conversions
>         between vector types.
>         * tree-vect-stmts.c (vectorizable_conversion): Extend the
>         non-widening and non-narrowing path to handle standard
>         conversion codes, if the target supports them.
>         * expr.c (convert_move): Try using the extend and truncate optabs
>         for vectors.
>         * optabs-tree.c (supportable_convert_operation): Likewise.
>         * config/aarch64/iterators.md (Vnarrowq): New iterator.
>         * config/aarch64/aarch64-simd.md (<optab><Vnarrowq><mode>2)
>         (trunc<mode><Vnarrowq>2): New patterns.
>
> gcc/testsuite/
>         * gcc.dg/vect/bb-slp-pr69907.c: Do not expect BB vectorization
>         to fail for aarch64 targets.
>         * gcc.dg/vect/no-scevccp-outer-12.c: Expect the test to pass
>         on aarch64 targets.
>         * gcc.dg/vect/vect-double-reduc-5.c: Likewise.
>         * gcc.dg/vect/vect-outer-4e.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_5.c: New test.
>         * gcc.target/aarch64/vect_mixed_sizes_6.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_7.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_8.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_9.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_10.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_11.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_12.c: Likewise.
>         * gcc.target/aarch64/vect_mixed_sizes_13.c: Likewise.
>
> Index: gcc/tree-cfg.c
> ===================================================================
> --- gcc/tree-cfg.c 2019-09-05 08:49:30.829739618 +0100
> +++ gcc/tree-cfg.c 2019-11-06 12:44:22.832365429 +0000
> @@ -3553,6 +3553,24 @@ verify_gimple_assign_unary (gassign *stm
>      {
>      CASE_CONVERT:
>        {
> +        /* Allow conversions between vectors with the same number of elements,
> +           provided that the conversion is OK for the element types too.  */
> +        if (VECTOR_TYPE_P (lhs_type)
> +            && VECTOR_TYPE_P (rhs1_type)
> +            && known_eq (TYPE_VECTOR_SUBPARTS (lhs_type),
> +                         TYPE_VECTOR_SUBPARTS (rhs1_type)))
> +          {
> +            lhs_type = TREE_TYPE (lhs_type);
> +            rhs1_type = TREE_TYPE (rhs1_type);
> +          }
> +        else if (VECTOR_TYPE_P (lhs_type) || VECTOR_TYPE_P (rhs1_type))
> +          {
> +            error ("invalid vector types in nop conversion");
> +            debug_generic_expr (lhs_type);
> +            debug_generic_expr (rhs1_type);
> +            return true;
> +          }
> +
>          /* Allow conversions from pointer type to integral type only if
>             there is no sign or zero extension involved.
>             For targets were the precision of ptrofftype doesn't match that
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c 2019-11-06 12:44:10.896448608 +0000
> +++ gcc/tree-vect-stmts.c 2019-11-06 12:44:22.832365429 +0000
> @@ -4869,7 +4869,9 @@ vectorizable_conversion (stmt_vec_info s
>    switch (modifier)
>      {
>      case NONE:
> -      if (code != FIX_TRUNC_EXPR && code != FLOAT_EXPR)
> +      if (code != FIX_TRUNC_EXPR
> +          && code != FLOAT_EXPR
> +          && !CONVERT_EXPR_CODE_P (code))
>          return false;
>        if (supportable_convert_operation (code, vectype_out, vectype_in,
>                                           &decl1, &code1))
> Index: gcc/expr.c
> ===================================================================
> --- gcc/expr.c 2019-11-06 12:29:17.394677341 +0000
> +++ gcc/expr.c 2019-11-06 12:44:22.828365457 +0000
> @@ -250,6 +250,31 @@ convert_move (rtx to, rtx from, int unsi
>
>    if (VECTOR_MODE_P (to_mode) || VECTOR_MODE_P (from_mode))
>      {
> +      if (GET_MODE_UNIT_PRECISION (to_mode)
> +          > GET_MODE_UNIT_PRECISION (from_mode))
> +        {
> +          optab op = unsignedp ? zext_optab : sext_optab;
> +          insn_code icode = convert_optab_handler (op, to_mode, from_mode);
> +          if (icode != CODE_FOR_nothing)
> +            {
> +              emit_unop_insn (icode, to, from,
> +                              unsignedp ? ZERO_EXTEND : SIGN_EXTEND);
> +              return;
> +            }
> +        }
> +
> +      if (GET_MODE_UNIT_PRECISION (to_mode)
> +          < GET_MODE_UNIT_PRECISION (from_mode))
> +        {
> +          insn_code icode = convert_optab_handler (trunc_optab,
> +                                                   to_mode, from_mode);
> +          if (icode != CODE_FOR_nothing)
> +            {
> +              emit_unop_insn (icode, to, from, TRUNCATE);
> +              return;
> +            }
> +        }
> +
>        gcc_assert (known_eq (GET_MODE_BITSIZE (from_mode),
>                              GET_MODE_BITSIZE (to_mode)));
>
> Index: gcc/optabs-tree.c
> ===================================================================
> --- gcc/optabs-tree.c 2019-11-06 12:28:23.000000000 +0000
> +++ gcc/optabs-tree.c 2019-11-06 12:44:22.828365457 +0000
> @@ -303,6 +303,20 @@ supportable_convert_operation (enum tree
>        return true;
>      }
>
> +  if (GET_MODE_UNIT_PRECISION (m1) > GET_MODE_UNIT_PRECISION (m2)
> +      && can_extend_p (m1, m2, TYPE_UNSIGNED (vectype_in)))
> +    {
> +      *code1 = code;
> +      return true;
> +    }
> +
> +  if (GET_MODE_UNIT_PRECISION (m1) < GET_MODE_UNIT_PRECISION (m2)
> +      && convert_optab_handler (trunc_optab, m1, m2) != CODE_FOR_nothing)
> +    {
> +      *code1 = code;
> +      return true;
> +    }
> +
>    /* Now check for builtin.  */
>    if (targetm.vectorize.builtin_conversion
>        && targetm.vectorize.builtin_conversion (code, vectype_out, vectype_in))
> Index: gcc/config/aarch64/iterators.md
> ===================================================================
> --- gcc/config/aarch64/iterators.md 2019-11-06 12:28:23.000000000 +0000
> +++ gcc/config/aarch64/iterators.md 2019-11-06 12:44:22.824365485 +0000
> @@ -933,6 +933,8 @@ (define_mode_attr VNARROWQ [(V8HI "V8QI"
>                              (V2DI "V2SI")
>                              (DI "SI") (SI "HI")
>                              (HI "QI")])
> +(define_mode_attr Vnarrowq [(V8HI "v8qi") (V4SI "v4hi")
> +                            (V2DI "v2si")])
>
>  ;; Narrowed quad-modes for VQN (Used for XTN2).
>  (define_mode_attr VNARROWQ2 [(V8HI "V16QI") (V4SI "V8HI")
> Index: gcc/config/aarch64/aarch64-simd.md
> ===================================================================
> --- gcc/config/aarch64/aarch64-simd.md 2019-11-06 12:28:23.000000000 +0000
> +++ gcc/config/aarch64/aarch64-simd.md 2019-11-06 12:44:22.824365485 +0000
> @@ -7007,3 +7007,21 @@ (define_insn "aarch64_crypto_pmullv2di"
>    "pmull2\\t%0.1q, %1.2d, %2.2d"
>    [(set_attr "type" "crypto_pmull")]
>  )
> +
> +;; Sign- or zero-extend a 64-bit integer vector to a 128-bit vector.
> +(define_insn "<optab><Vnarrowq><mode>2"
> +  [(set (match_operand:VQN 0 "register_operand" "=w")
> +        (ANY_EXTEND:VQN (match_operand:<VNARROWQ> 1 "register_operand" "w")))]
> +  "TARGET_SIMD"
> +  "<su>xtl\t%0.<Vtype>, %1.<Vntype>"
> +  [(set_attr "type" "neon_shift_imm_long")]
> +)
> +
> +;; Truncate a 128-bit integer vector to a 64-bit vector.
> +(define_insn "trunc<mode><Vnarrowq>2"
> +  [(set (match_operand:<VNARROWQ> 0 "register_operand" "=w")
> +        (truncate:<VNARROWQ> (match_operand:VQN 1 "register_operand" "w")))]
> +  "TARGET_SIMD"
> +  "xtn\t%0.<Vntype>, %1.<Vtype>"
> +  [(set_attr "type" "neon_shift_imm_narrow_q")]
> +)
> Index: gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c 2019-03-08 18:15:02.292871138 +0000
> +++ gcc/testsuite/gcc.dg/vect/bb-slp-pr69907.c 2019-11-06 12:44:22.828365457 +0000
> @@ -18,5 +18,6 @@ void foo(unsigned *p1, unsigned short *p
>  }
>
>  /* Disable for SVE because for long or variable-length vectors we don't
> -   get an unrolled epilogue loop.  */
> -/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a load is not supported" "slp1" { target { ! aarch64_sve } } } } */
> +   get an unrolled epilogue loop.  Also disable for AArch64 Advanced SIMD,
> +   because there we can vectorize the epilogue using mixed vector sizes.  */
> +/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a load is not supported" "slp1" { target { ! aarch64*-*-* } } } } */
> Index: gcc/testsuite/gcc.dg/vect/no-scevccp-outer-12.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-12.c 2019-11-06 12:28:23.000000000 +0000
> +++ gcc/testsuite/gcc.dg/vect/no-scevccp-outer-12.c 2019-11-06 12:44:22.828365457 +0000
> @@ -46,4 +46,4 @@ int main (void)
>  }
>
>  /* Until we support multiple types in the inner loop  */
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail *-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { ! aarch64*-*-* } } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c 2019-11-06 12:28:23.000000000 +0000
> +++ gcc/testsuite/gcc.dg/vect/vect-double-reduc-5.c 2019-11-06 12:44:22.828365457 +0000
> @@ -52,5 +52,5 @@ int main ()
>
>  /* Vectorization of loops with multiple types and double reduction is not
>     supported yet.  */
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { ! aarch64*-*-* } } } } */
>
> Index: gcc/testsuite/gcc.dg/vect/vect-outer-4e.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-outer-4e.c 2019-11-06 12:28:23.000000000 +0000
> +++ gcc/testsuite/gcc.dg/vect/vect-outer-4e.c 2019-11-06 12:44:22.828365457 +0000
> @@ -23,4 +23,4 @@ foo (){
>    return;
>  }
>
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { ! aarch64*-*-* } } } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_5.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_5.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int64_t *x, int64_t *y, int32_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 2];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\tsxtl\tv[0-9]+\.2d, v[0-9]+\.2s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.2d,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_6.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_6.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int32_t *x, int32_t *y, int16_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 4];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\tsxtl\tv[0-9]+\.4s, v[0-9]+\.4h\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.4s,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_7.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_7.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int16_t *x, int16_t *y, int8_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 8];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\tsxtl\tv[0-9]+\.8h, v[0-9]+\.8b\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.8h,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_8.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_8.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int64_t *x, int64_t *y, uint32_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 2];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\tuxtl\tv[0-9]+\.2d, v[0-9]+\.2s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.2d,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_9.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_9.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int32_t *x, int32_t *y, uint16_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 4];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\tuxtl\tv[0-9]+\.4s, v[0-9]+\.4h\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.4s,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_10.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_10.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int16_t *x, int16_t *y, uint8_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 8];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\tuxtl\tv[0-9]+\.8h, v[0-9]+\.8b\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.8h,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_11.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_11.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int32_t *x, int64_t *y, int64_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 2];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\txtn\tv[0-9]+\.2s, v[0-9]+\.2d\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.2d,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_12.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_12.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int16_t *x, int32_t *y, int32_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 4];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\txtn\tv[0-9]+\.4h, v[0-9]+\.4s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.4s,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_13.c
> ===================================================================
> --- /dev/null 2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/vect_mixed_sizes_13.c 2019-11-06 12:44:22.828365457 +0000
> @@ -0,0 +1,18 @@
> +/* { dg-options "-O2 -ftree-vectorize" } */
> +
> +#pragma GCC target "+nosve"
> +
> +#include <stdint.h>
> +
> +void
> +f (int8_t *x, int16_t *y, int16_t *z, int n)
> +{
> +  for (int i = 0; i < n; ++i)
> +    {
> +      x[i] = z[i];
> +      y[i] += y[i - 8];
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {\txtn\tv[0-9]+\.8b, v[0-9]+\.8h\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tv[0-9]+\.8h,} 1 } } */
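
For readers following the thread, here is a minimal C sketch (not part of the patch, and not GCC code) of the two ideas discussed above: the lane-count rule requested for verify_gimple_assign_unary, and the per-lane effect of the extend and truncate patterns. Everything in it is hypothetical and chosen for illustration only: struct vec_shape, same_lane_count, sign_extend_2, zero_extend_2, truncate_2 and check_examples do not exist in GCC, and the two-lane arrays merely stand in for a 64-bit and a 128-bit vector register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector type: lane count and element width.  */
struct vec_shape
{
  size_t lanes;
  unsigned elt_bits;
};

/* Model of the lane-count rule: a vector-to-vector conversion is only
   acceptable when both sides have the same number of lanes.  */
static int
same_lane_count (struct vec_shape lhs, struct vec_shape rhs)
{
  return lhs.lanes == rhs.lanes;
}

/* Lane-wise sign extension, 2 x int32_t -> 2 x int64_t.  */
static void
sign_extend_2 (int64_t out[2], const int32_t in[2])
{
  for (size_t i = 0; i < 2; ++i)
    out[i] = in[i];
}

/* Lane-wise zero extension, 2 x uint32_t -> 2 x uint64_t.  */
static void
zero_extend_2 (uint64_t out[2], const uint32_t in[2])
{
  for (size_t i = 0; i < 2; ++i)
    out[i] = in[i];
}

/* Lane-wise truncation, 2 x uint64_t -> 2 x uint32_t; each lane keeps
   only its low 32 bits.  */
static void
truncate_2 (uint32_t out[2], const uint64_t in[2])
{
  for (size_t i = 0; i < 2; ++i)
    out[i] = (uint32_t) in[i];
}

/* Small usage example exercising each helper.  */
static void
check_examples (void)
{
  struct vec_shape v2di = { 2, 64 }, v2si = { 2, 32 }, v4si = { 4, 32 };
  assert (same_lane_count (v2di, v2si));   /* 2 lanes on both sides.  */
  assert (!same_lane_count (v4si, v2di));  /* 4 lanes vs 2 lanes.  */

  const int32_t s[2] = { -1, 7 };
  int64_t w[2];
  sign_extend_2 (w, s);
  assert (w[0] == -1 && w[1] == 7);

  const uint32_t u[2] = { UINT32_MAX, 7 };
  uint64_t z[2];
  zero_extend_2 (z, u);
  assert (z[0] == UINT32_MAX && z[1] == 7);

  const uint64_t d[2] = { UINT64_C (0x100000002), 3 };
  uint32_t t[2];
  truncate_2 (t, d);
  assert (t[0] == 2 && t[1] == 3);
}
```

The sketch uses unsigned types for truncation so that discarding the high bits is well defined in ISO C; the real patterns operate on whole vector registers rather than on arrays.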