From: Philipp Tomsich
To: gcc-patches@gcc.gnu.org
Cc: Christoph Muellner, Palmer Dabbelt, Kito Cheng, Vineet Gupta, Jeff Law, Philipp Tomsich
Subject: [PATCH] RISC-V: Optimize slli(.uw)? + addw + zext.w into sh[123]add + zext.w
Date: Tue, 8 Nov 2022 20:57:30 +0100
Message-Id: <20221108195730.2701496-1-philipp.tomsich@vrull.eu>

gcc/ChangeLog:

	* config/riscv/bitmanip.md: Handle corner-cases for combine
	when chaining slli(.uw)? + addw.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zba-shNadd-04.c: New test.

---
 gcc/config/riscv/bitmanip.md                  | 49 +++++++++++++++++++
 gcc/config/riscv/riscv-protos.h               |  1 +
 gcc/config/riscv/riscv.cc                     |  7 +++
 .../gcc.target/riscv/zba-shNadd-04.c          | 23 +++++++++
 4 files changed, 80 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shNadd-04.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 726a07b0d90..cbc00455b67 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -56,6 +56,55 @@
   [(set (match_dup 5) (plus:DI (ashift:DI (match_dup 1) (match_dup 2)) (match_dup 3)))
    (set (match_dup 0) (sign_extend:DI (div:SI (subreg:SI (match_dup 5) 0) (subreg:SI (match_dup 4) 0))))])
 
+; Zba does not provide W-forms of sh[123]add(.uw)?, which leads to an
+; interesting irregularity: we can generate a signed 32-bit result
+; using slli(.uw)? + addw, but an unsigned 32-bit result can be
+; generated more efficiently as sh[123]add + zext.w (the .uw can be
+; dropped if we zero-extend the output anyway).
+;
+; To enable this optimization, we split [ slli(.uw)?, addw, zext.w ]
+; into [ sh[123]add, zext.w ] for use during combine.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (zero_extend:DI (plus:SI (ashift:SI (subreg:SI (match_operand:DI 1 "register_operand") 0)
+                                            (match_operand:QI 2 "imm123_operand"))
+                                 (subreg:SI (match_operand:DI 3 "register_operand") 0))))]
+  "TARGET_64BIT && TARGET_ZBA"
+  [(set (match_dup 0) (plus:DI (ashift:DI (match_dup 1) (match_dup 2)) (match_dup 3)))
+   (set (match_dup 0) (zero_extend:DI (subreg:SI (match_dup 0) 0)))])
+
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (zero_extend:DI (plus:SI (subreg:SI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+                                                               (match_operand:QI 2 "imm123_operand"))
+                                                    (match_operand:DI 3 "consecutive_bits_operand")) 0)
+                                 (subreg:SI (match_operand:DI 4 "register_operand") 0))))]
+  "TARGET_64BIT && TARGET_ZBA
+   && riscv_shamt_matches_mask_p (INTVAL (operands[2]), INTVAL (operands[3]))"
+  [(set (match_dup 0) (plus:DI (ashift:DI (match_dup 1) (match_dup 2)) (match_dup 4)))
+   (set (match_dup 0) (zero_extend:DI (subreg:SI (match_dup 0) 0)))])
+
+; Make sure that an andi followed by a sh[123]add remains a two-instruction
+; sequence--and is not torn apart into slli, srli, add.
+(define_insn_and_split "*andi_add.uw"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+                                    (match_operand:QI 2 "imm123_operand" "Ds3"))
+                         (match_operand:DI 3 "consecutive_bits_operand" ""))
+                 (match_operand:DI 4 "register_operand" "r")))
+   (clobber (match_scratch:DI 5 "=&r"))]
+  "TARGET_64BIT && TARGET_ZBA
+   && riscv_shamt_matches_mask_p (INTVAL (operands[2]), INTVAL (operands[3]))
+   && SMALL_OPERAND (INTVAL (operands[3]) >> INTVAL (operands[2]))"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 5) (and:DI (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (plus:DI (ashift:DI (match_dup 5) (match_dup 2))
+                               (match_dup 4)))]
+{
+  operands[3] = GEN_INT (INTVAL (operands[3]) >> INTVAL (operands[2]));
+})
+
 (define_insn "*shNadduw"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(plus:DI
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a718bb62b4..2ec3af05aa4 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -77,6 +77,7 @@ extern bool riscv_gpr_save_operation_p (rtx);
 extern void riscv_reinit (void);
 extern poly_uint64 riscv_regmode_natural_size (machine_mode);
 extern bool riscv_v_ext_vector_mode_p (machine_mode);
+extern bool riscv_shamt_matches_mask_p (int, HOST_WIDE_INT);
 
 /* Routines implemented in riscv-c.cc.  */
 void riscv_cpu_cpp_builtins (cpp_reader *);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0b2c4b3599d..5a632058003 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6497,6 +6497,13 @@ riscv_regmode_natural_size (machine_mode mode)
   return UNITS_PER_WORD;
 }
 
+/* Return true if a shift amount matches the trailing cleared bits of a bitmask.  */
+bool
+riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
+{
+  return shamt == ctz_hwi (mask);
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
diff --git a/gcc/testsuite/gcc.target/riscv/zba-shNadd-04.c b/gcc/testsuite/gcc.target/riscv/zba-shNadd-04.c
new file mode 100644
index 00000000000..abed1491039
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba-shNadd-04.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zba -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long long sub1(unsigned long long a, unsigned long long b)
+{
+  b = (b << 32) >> 31;
+  unsigned int x = a + b;
+  return x;
+}
+
+long long sub2(unsigned long long a, unsigned long long b)
+{
+  return (unsigned int)(a + (b << 1));
+}
+
+long long sub3(unsigned long long a, unsigned long long b)
+{
+  return (a + (b << 1)) & ~0u;
+}
+
+/* { dg-final { scan-assembler-times "sh1add" 3 } } */
+/* { dg-final { scan-assembler-times "zext.w\t" 3 } } */
-- 
2.34.1
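
Note for readers (not part of the patch): the arithmetic behind the two splits
can be sketched in plain C. This is only an illustrative model under stated
assumptions, not GCC code: every function name below is hypothetical, ctz64 is
a portable stand-in for GCC's ctz_hwi, and the three models approximate the
instruction sequences named in the bitmanip.md comment. The point it tries to
show is that only the low 32 bits survive the final zext.w, so doing the
shift-and-add in 64 bits and dropping the .uw masking gives the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of slli + addw + zext.w: the add is done on the
   low 32 bits only, and the result is zero-extended to 64 bits.  */
static uint64_t
model_slli_addw_zext (uint64_t a, uint64_t b, unsigned shamt)
{
  uint32_t sum = (uint32_t) (b << shamt) + (uint32_t) a;
  return (uint64_t) sum;
}

/* Hypothetical model of slli.uw + addw + zext.w: the shifted value is
   first masked with (0xffffffff << shamt), as in the second split.  */
static uint64_t
model_slliuw_addw_zext (uint64_t a, uint64_t b, unsigned shamt)
{
  uint64_t shifted = (b << shamt) & (UINT64_C (0xffffffff) << shamt);
  uint32_t sum = (uint32_t) shifted + (uint32_t) a;
  return (uint64_t) sum;
}

/* Hypothetical model of sh[123]add + zext.w: a full 64-bit
   shift-and-add, truncated to 32 bits afterwards.  */
static uint64_t
model_shNadd_zext (uint64_t a, uint64_t b, unsigned shamt)
{
  uint64_t sum = (b << shamt) + a;
  return sum & UINT64_C (0xffffffff);
}

/* Portable count-trailing-zeros; returns 64 for a zero input.  */
static int
ctz64 (uint64_t x)
{
  int n = 0;
  if (x == 0)
    return 64;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Same check as riscv_shamt_matches_mask_p, on plain C types.  */
static int
model_shamt_matches_mask (int shamt, uint64_t mask)
{
  return shamt == ctz64 (mask);
}

/* Assert that all three models agree for one input triple.  */
static void
check_models_agree (uint64_t a, uint64_t b, unsigned shamt)
{
  uint64_t expected = model_shNadd_zext (a, b, shamt);
  assert (model_slli_addw_zext (a, b, shamt) == expected);
  assert (model_slliuw_addw_zext (a, b, shamt) == expected);
}
```

Calling check_models_agree with shift amounts 1, 2 and 3 and operands that
have bits set above bit 31 exercises the case the splits care about. The mask
0x1fffffffe (0xffffffff << 1) matches shamt 1 under model_shamt_matches_mask,
which is the shape the second split's condition looks for.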