From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw1-x112b.google.com (mail-yw1-x112b.google.com [IPv6:2607:f8b0:4864:20::112b]) by sourceware.org (Postfix) with ESMTPS id 97C79385840D for ; Mon, 17 Apr 2023 18:37:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 97C79385840D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-yw1-x112b.google.com with SMTP id 00721157ae682-54ee0b73e08so430580637b3.0 for ; Mon, 17 Apr 2023 11:37:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1681756625; x=1684348625; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=V/VXsleDiAHLosN3szsaf7V8WtfFsIyW00/0RzZxydc=; b=CGpCJd3x+XG4YoKtMbllI0yfcesA9+KBaFfKaPDby1SfWMQpPDVIggokEgKToDC922 TZaIZ9+/X6cW9zDzcr+eydtlBHI/qTRZe9LOi5fBTWBs8KnhoqQTMbnkm4KEuv4jMh1Y BgDF5Dqag1ETMAVR3i9mYtqeLSQawwlYXqKtnwAzd2uKkZv5zU9P32E0DPOg+Mc0LWCH xtSMIuRE0s1chegzLxYa/nbDMUNCWPjAa9No7Ykj7+T24sJSv8JvAvta3PS2n8GRGRK6 xvzi6bQvE5T3QWtj4acqIvVIVqMvxuoxBDIztJC1PVkbkDZNltLAhbD1Pq2kGlWkSI+P oinQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1681756625; x=1684348625; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=V/VXsleDiAHLosN3szsaf7V8WtfFsIyW00/0RzZxydc=; b=lv+c3SiB4FFLIKIk3qa0DILd0LqoqgOrIjjyv+ilkgxacPUd+R2IwPC85xH5AorE0d aLrsRkv2AhA8TwiA8LGWQB7NNbeabf6uWOxIe6KRNGE42auXLhMqw4AcyF3PppEpy0u4 1TlylW/pnljDkEojJ/+zLbVdutJBcJsuV2lmBnW2ND1Tt6dWBqAaBaEcSlCZ+c/IaXG7 S3EMFl018UQ2faubnxVXCMl5O9fenokT+29hM+awfeuTBOTMZPxw47O8piS4xbjXyB9m caEIiJlbArw4vqvZUREOLbD8wU1/TJXbqb8nKrJpY5bEhHeo0HNgEapTju8rPsSvzqpS /8iw== X-Gm-Message-State: AAQBX9e4CoMl7rK9qgacef+8xDOUF+4Bey8gESl8qGCgtnNKWIvIOyR4 YHTfabMUe4LZRySTQ3scDUp2/8JhgLZgETZWrFZglA== X-Google-Smtp-Source: AKy350ZynCDdWK47LaePEkV/cZ3qvs8eoW52G6BT97BMnYVmvy9umNh9zcnTvFOHZha0dstS1VQxAA== X-Received: by 2002:a81:a141:0:b0:54f:ae01:79f4 with SMTP id y62-20020a81a141000000b0054fae0179f4mr15416015ywg.21.1681756624872; Mon, 17 Apr 2023 11:37:04 -0700 (PDT) Received: from system76-pc.ba.rivosinc.com ([136.57.172.92]) by smtp.gmail.com with ESMTPSA id 66-20020a810645000000b0054f6f65f258sm3278559ywg.16.2023.04.17.11.37.04 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Apr 2023 11:37:04 -0700 (PDT) From: Michael Collison To: gcc-patches@gcc.gnu.org Subject: [PATCH v4 04/10] RISC-V:autovec: Add target vectorization hooks Date: Mon, 17 Apr 2023 14:36:55 -0400 Message-Id: <20230417183701.2249183-5-collison@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230417183701.2249183-1-collison@rivosinc.com> References: <20230417183701.2249183-1-collison@rivosinc.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: 2023-03-02 Michael Collison Juzhe Zhong * config/riscv/riscv.cc (riscv_option_override): Set riscv_vectorization_factor. (riscv_estimated_poly_value): Implement TARGET_ESTIMATED_POLY_VALUE. (riscv_preferred_simd_mode): Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE. (riscv_autovectorize_vector_modes): Implement TARGET_AUTOVECTORIZE_VECTOR_MODES. (riscv_get_mask_mode): Implement TARGET_VECTORIZE_GET_MASK_MODE. (riscv_empty_mask_is_expensive): Implement TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE. (riscv_vectorize_create_costs): Implement TARGET_VECTORIZE_CREATE_COSTS. (TARGET_ESTIMATED_POLY_VALUE): Register target macro. (TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto. (TARGET_VECTORIZE_GET_MASK_MODE): Ditto. (TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto. (TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto. --- gcc/config/riscv/riscv.cc | 156 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 156 insertions(+) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index dc47434fac4..9af06d926cf 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -60,6 +60,15 @@ along with GCC; see the file COPYING3. If not see #include "opts.h" #include "tm-constrs.h" #include "rtl-iter.h" +#include "gimple.h" +#include "cfghooks.h" +#include "cfgloop.h" +#include "cfgrtl.h" +#include "sel-sched.h" +#include "fold-const.h" +#include "gimple-iterator.h" +#include "gimple-expr.h" +#include "tree-vectorizer.h" /* This file should be included last. */ #include "target-def.h" @@ -275,6 +284,9 @@ poly_uint16 riscv_vector_chunks; /* The number of bytes in a vector chunk. */ unsigned riscv_bytes_per_vector_chunk; +/* Prefer vf for auto-vectorizer. */ +unsigned riscv_vectorization_factor; + /* Index R is the smallest register class that contains register R. */ const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = { GR_REGS, GR_REGS, GR_REGS, GR_REGS, @@ -6363,6 +6375,10 @@ riscv_option_override (void) /* Convert -march to a chunks count. */ riscv_vector_chunks = riscv_convert_vector_bits (); + + if (TARGET_VECTOR) + riscv_vectorization_factor = riscv_vector_lmul; + } /* Implement TARGET_CONDITIONAL_REGISTER_USAGE. */ @@ -7057,6 +7073,128 @@ riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor, return RISCV_DWARF_VLENB; } +/* Implement TARGET_ESTIMATED_POLY_VALUE. + Look into the tuning structure for an estimate. + KIND specifies the type of requested estimate: min, max or likely. + For cores with a known RVV width all three estimates are the same. + For generic RVV tuning we want to distinguish the maximum estimate from + the minimum and likely ones. + The likely estimate is the same as the minimum in that case to give a + conservative behavior of auto-vectorizing with RVV when it is a win + even for 128-bit RVV. + When RVV width information is available VAL.coeffs[1] is multiplied by + the number of VQ chunks over the initial Advanced SIMD 128 bits. */ + +static HOST_WIDE_INT +riscv_estimated_poly_value (poly_int64 val, + poly_value_estimate_kind kind = POLY_VALUE_LIKELY) +{ + unsigned int width_source = BITS_PER_RISCV_VECTOR.is_constant () + ? (unsigned int) BITS_PER_RISCV_VECTOR.to_constant () + : (unsigned int) RVV_SCALABLE; + + /* If there is no core-specific information then the minimum and likely + values are based on 128-bit vectors and the maximum is based on + the architectural maximum of 2048 bits. */ + if (width_source == RVV_SCALABLE) + switch (kind) + { + case POLY_VALUE_MIN: + case POLY_VALUE_LIKELY: + return val.coeffs[0]; + + case POLY_VALUE_MAX: + return val.coeffs[0] + val.coeffs[1] * 15; + } + + /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the + lowest as likely. This could be made more general if future -mtune + options need it to be. */ + if (kind == POLY_VALUE_MAX) + width_source = 1 << floor_log2 (width_source); + else + width_source = least_bit_hwi (width_source); + + /* If the core provides width information, use that. */ + HOST_WIDE_INT over_128 = width_source - 128; + return val.coeffs[0] + val.coeffs[1] * over_128 / 128; +} + +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE. */ + +static machine_mode +riscv_preferred_simd_mode (scalar_mode mode) +{ + machine_mode vmode = + riscv_vector::riscv_vector_preferred_simd_mode (mode, + riscv_vectorization_factor); + if (VECTOR_MODE_P (vmode)) + return vmode; + + return word_mode; +} + +/* Implement TARGET_AUTOVECTORIZE_VECTOR_MODES for RVV. */ +static unsigned int +riscv_autovectorize_vector_modes (vector_modes *modes, bool) +{ + if (!TARGET_VECTOR) + return 0; + + if (riscv_vectorization_factor == RVV_LMUL1) + { + modes->safe_push (VNx16QImode); + modes->safe_push (VNx8QImode); + modes->safe_push (VNx4QImode); + modes->safe_push (VNx2QImode); + } + else if (riscv_vectorization_factor == RVV_LMUL2) + { + modes->safe_push (VNx32QImode); + modes->safe_push (VNx16QImode); + modes->safe_push (VNx8QImode); + modes->safe_push (VNx4QImode); + } + else if (riscv_vectorization_factor == RVV_LMUL4) + { + modes->safe_push (VNx64QImode); + modes->safe_push (VNx32QImode); + modes->safe_push (VNx16QImode); + modes->safe_push (VNx8QImode); + } + else + { + modes->safe_push (VNx64QImode); + modes->safe_push (VNx32QImode); + modes->safe_push (VNx16QImode); + } + + return 0; +} + +/* Implement TARGET_VECTORIZE_GET_MASK_MODE. */ + +static opt_machine_mode +riscv_get_mask_mode (machine_mode mode) +{ + machine_mode mask_mode = VOIDmode; + if (TARGET_VECTOR + && riscv_vector::riscv_vector_get_mask_mode (mode).exists (&mask_mode)) + return mask_mode; + + return default_get_mask_mode (mode); +} + +/* Implement TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE. Assume for now that + it isn't worth branching around empty masked ops (including masked + stores). */ + +static bool +riscv_empty_mask_is_expensive (unsigned) +{ + return false; +} + /* Return true if a shift-amount matches the trailing cleared bits on a bitmask. */ @@ -7382,6 +7520,24 @@ riscv_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs) #undef TARGET_VERIFY_TYPE_CONTEXT #define TARGET_VERIFY_TYPE_CONTEXT riscv_verify_type_context +#undef TARGET_ESTIMATED_POLY_VALUE +#define TARGET_ESTIMATED_POLY_VALUE riscv_estimated_poly_value + +#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE +#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE riscv_preferred_simd_mode + +#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES +#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES riscv_autovectorize_vector_modes + +#undef TARGET_VECTORIZE_GET_MASK_MODE +#define TARGET_VECTORIZE_GET_MASK_MODE riscv_get_mask_mode + +#undef TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE +#define TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE riscv_empty_mask_is_expensive + +#undef TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK +#define TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK riscv_loop_len_override_mask + #undef TARGET_VECTOR_ALIGNMENT #define TARGET_VECTOR_ALIGNMENT riscv_vector_alignment -- 2.34.1