Date: Wed, 06 Sep 2023 09:22:42 -0700 (PDT)
Subject: Re: [PATCH v2 1/2] riscv: Add support for strlen inline expansion
In-Reply-To: <20230906160734.2422522-2-christoph.muellner@vrull.eu>
CC: gcc-patches@gcc.gnu.org, kito.cheng@sifive.com, Jim Wilson, Andrew Waterman, philipp.tomsich@vrull.eu, jeffreyalaw@gmail.com, Vineet Gupta, christoph.muellner@vrull.eu
From: Palmer Dabbelt
To: christoph.muellner@vrull.eu

On Wed, 06 Sep 2023 09:07:33 PDT (-0700), christoph.muellner@vrull.eu wrote:
> From: Christoph Müllner
>
> This patch implements the expansion of the strlen builtin for RV32/RV64
> for xlen-aligned strings if Zbb or XTheadBb instructions are available.
> The inserted sequences are:
>
> rv32gc_zbb (RV64 is similar):
>       add     a3,a0,4
>       li      a4,-1
> .L1:  lw      a5,0(a0)
>       add     a0,a0,4
>       orc.b   a5,a5
>       beq     a5,a4,.L1
>       not     a5,a5
>       ctz     a5,a5
>       srl     a5,a5,0x3
>       add     a0,a0,a5
>       sub     a0,a0,a3
>
> rv64gc_xtheadbb (RV32 is similar):
>       add       a4,a0,8
> .L2:  ld        a5,0(a0)
>       add       a0,a0,8
>       th.tstnbz a5,a5
>       beqz      a5,.L2
>       th.rev    a5,a5
>       th.ff1    a5,a5
>       srl       a5,a5,0x3
>       add       a0,a0,a5
>       sub       a0,a0,a4
>
> This allows to inline calls to strlen(), with optimized code for
> xlen-aligned strings, resulting in the following benefits over
> a call to libc:
> * no call/ret instructions
> * no stack frame allocation
> * no register saving/restoring
> * no alignment test
>
> The inlining mechanism is gated by a new switch ('-minline-strlen')
> and by the variable 'optimize_size'.

Maybe this is more of a Jeff question, but this looks to me like
something that should be target-agnostic -- maybe we need some backend
work to actually emit the special instruction, but IIRC this is a
somewhat common flavor of instruction and is in other ISAs as well.

It looks like there's already a strlen insn, so I guess the core issue
is why we need that unspec?

Sorry if I'm just missing something, though...

> Tested using the glibc string tests.
>
> Signed-off-by: Christoph Müllner
>
> gcc/ChangeLog:
>
> 	* config.gcc: Add new object riscv-string.o.
> 	riscv-string.cc.
> 	* config/riscv/riscv-protos.h (riscv_expand_strlen):
> 	New function.
> 	* config/riscv/riscv.md (strlen): New expand INSN.
> 	* config/riscv/riscv.opt: New flag 'minline-strlen'.
> 	* config/riscv/t-riscv: Add new object riscv-string.o.
> 	* config/riscv/thead.md (th_rev2): Export INSN name.
> 	(th_rev2): Likewise.
> 	(th_tstnbz2): New INSN.
> 	* doc/invoke.texi: Document '-minline-strlen'.
> 	* emit-rtl.cc (emit_likely_jump_insn): New helper function.
> 	(emit_unlikely_jump_insn): Likewise.
> 	* rtl.h (emit_likely_jump_insn): New prototype.
> 	(emit_unlikely_jump_insn): Likewise.
> 	* config/riscv/riscv-string.cc: New file.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/riscv/xtheadbb-strlen-unaligned.c: New test.
> 	* gcc.target/riscv/xtheadbb-strlen.c: New test.
> 	* gcc.target/riscv/zbb-strlen-disabled-2.c: New test.
> 	* gcc.target/riscv/zbb-strlen-disabled.c: New test.
> 	* gcc.target/riscv/zbb-strlen-unaligned.c: New test.
> 	* gcc.target/riscv/zbb-strlen.c: New test.
> ---
>  gcc/config.gcc                                |   3 +-
>  gcc/config/riscv/riscv-protos.h               |   3 +
>  gcc/config/riscv/riscv-string.cc              | 183 ++++++++++++++++++
>  gcc/config/riscv/riscv.md                     |  28 +++
>  gcc/config/riscv/riscv.opt                    |   4 +
>  gcc/config/riscv/t-riscv                      |   6 +
>  gcc/config/riscv/thead.md                     |   9 +-
>  gcc/doc/invoke.texi                           |  11 +-
>  gcc/emit-rtl.cc                               |  24 +++
>  gcc/rtl.h                                     |   2 +
>  .../riscv/xtheadbb-strlen-unaligned.c         |  14 ++
>  .../gcc.target/riscv/xtheadbb-strlen.c        |  19 ++
>  .../gcc.target/riscv/zbb-strlen-disabled-2.c  |  15 ++
>  .../gcc.target/riscv/zbb-strlen-disabled.c    |  15 ++
>  .../gcc.target/riscv/zbb-strlen-unaligned.c   |  14 ++
>  gcc/testsuite/gcc.target/riscv/zbb-strlen.c   |  19 ++
>  16 files changed, 366 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/config/riscv/riscv-string.cc
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strlen-unaligned.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strlen.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen.c
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b2fe7c7ceef..aff6b6a5601 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -530,7 +530,8 @@ pru-*-*)
>  	;;
>  riscv*)
>  	cpu_type=riscv
> -	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o riscv-vector-costs.o"
> +	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-string.o"
> +	extra_objs="${extra_objs} riscv-v.o riscv-vsetvl.o riscv-vector-costs.o"
>  	extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
>  	extra_objs="${extra_objs} thead.o"
>  	d_target_objs="riscv-d.o"
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 6dbf6b9f943..b060d047f01 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -517,6 +517,9 @@ const unsigned int RISCV_BUILTIN_SHIFT = 1;
>  /* Mask that selects the riscv_builtin_class part of a function code. */
>  const unsigned int RISCV_BUILTIN_CLASS = (1 << RISCV_BUILTIN_SHIFT) - 1;
>
> +/* Routines implemented in riscv-string.cc. */
> +extern bool riscv_expand_strlen (rtx, rtx, rtx, rtx);
> +
>  /* Routines implemented in thead.cc. */
>  extern bool th_mempair_operands_p (rtx[4], bool, machine_mode);
>  extern void th_mempair_order_operands (rtx[4], bool, machine_mode);
> diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
> new file mode 100644
> index 00000000000..086900a6083
> --- /dev/null
> +++ b/gcc/config/riscv/riscv-string.cc
> @@ -0,0 +1,183 @@
> +/* Subroutines used to expand string operations for RISC-V.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published
> +   by the Free Software Foundation; either version 3, or (at your
> +   option) any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +   or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
> +   License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3. If not see
> +   . */
> +
> +#define IN_TARGET_CODE 1
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "backend.h"
> +#include "rtl.h"
> +#include "tree.h"
> +#include "memmodel.h"
> +#include "tm_p.h"
> +#include "ira.h"
> +#include "print-tree.h"
> +#include "varasm.h"
> +#include "explow.h"
> +#include "expr.h"
> +#include "output.h"
> +#include "target.h"
> +#include "predict.h"
> +#include "optabs.h"
> +
> +/* Emit proper instruction depending on mode of dest. */
> +
> +#define GEN_EMIT_HELPER2(name) \
> +static rtx_insn * \
> +do_## name ## 2(rtx dest, rtx src) \
> +{ \
> +  rtx_insn *insn; \
> +  if (GET_MODE (dest) == DImode) \
> +    insn = emit_insn (gen_ ## name ## di2 (dest, src)); \
> +  else \
> +    insn = emit_insn (gen_ ## name ## si2 (dest, src)); \
> +  return insn; \
> +}
> +
> +/* Emit proper instruction depending on mode of dest. */
> +
> +#define GEN_EMIT_HELPER3(name) \
> +static rtx_insn * \
> +do_## name ## 3(rtx dest, rtx src1, rtx src2) \
> +{ \
> +  rtx_insn *insn; \
> +  if (GET_MODE (dest) == DImode) \
> +    insn = emit_insn (gen_ ## name ## di3 (dest, src1, src2)); \
> +  else \
> +    insn = emit_insn (gen_ ## name ## si3 (dest, src1, src2)); \
> +  return insn; \
> +}
> +
> +GEN_EMIT_HELPER3(add) /* do_add3 */
> +GEN_EMIT_HELPER2(clz) /* do_clz2 */
> +GEN_EMIT_HELPER2(ctz) /* do_ctz2 */
> +GEN_EMIT_HELPER3(lshr) /* do_lshr3 */
> +GEN_EMIT_HELPER2(orcb) /* do_orcb2 */
> +GEN_EMIT_HELPER2(one_cmpl) /* do_one_cmpl2 */
> +GEN_EMIT_HELPER3(sub) /* do_sub3 */
> +GEN_EMIT_HELPER2(th_rev) /* do_th_rev2 */
> +GEN_EMIT_HELPER2(th_tstnbz) /* do_th_tstnbz2 */
> +GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2 */
> +
> +#undef GEN_EMIT_HELPER2
> +#undef GEN_EMIT_HELPER3
> +
> +/* Helper function to load a byte or a Pmode register.
> +
> +   MODE is the mode to use for the load (QImode or Pmode).
> +   DEST is the destination register for the data.
> +   ADDR_REG is the register that holds the address.
> +   ADDR is the address expression to load from.
> +
> +   This function returns an rtx containing the register,
> +   where the ADDR is stored. */
> +
> +static rtx
> +do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr)
> +{
> +  rtx mem = gen_rtx_MEM (mode, addr_reg);
> +  MEM_COPY_ATTRIBUTES (mem, addr);
> +  set_mem_size (mem, GET_MODE_SIZE (mode));
> +
> +  if (mode == QImode)
> +    do_zero_extendqi2 (dest, mem);
> +  else if (mode == Xmode)
> +    emit_move_insn (dest, mem);
> +  else
> +    gcc_unreachable ();
> +
> +  return addr_reg;
> +}
> +
> +/* If the provided string is aligned, then read XLEN bytes
> +   in a loop and use orc.b to find NUL-bytes. */
> +
> +static bool
> +riscv_expand_strlen_scalar (rtx result, rtx src, rtx align)
> +{
> +  rtx testval, addr, addr_plus_regsz, word, zeros;
> +  rtx loop_label, cond;
> +
> +  gcc_assert (TARGET_ZBB || TARGET_XTHEADBB);
> +
> +  /* The alignment needs to be known and big enough. */
> +  if (!CONST_INT_P (align) || UINTVAL (align) < GET_MODE_SIZE (Xmode))
> +    return false;
> +
> +  testval = gen_reg_rtx (Xmode);
> +  addr = copy_addr_to_reg (XEXP (src, 0));
> +  addr_plus_regsz = gen_reg_rtx (Pmode);
> +  word = gen_reg_rtx (Xmode);
> +  zeros = gen_reg_rtx (Xmode);
> +
> +  if (TARGET_ZBB)
> +    emit_insn (gen_rtx_SET (testval, constm1_rtx));
> +  else
> +    emit_insn (gen_rtx_SET (testval, const0_rtx));
> +
> +  do_add3 (addr_plus_regsz, addr, GEN_INT (UNITS_PER_WORD));
> +
> +  loop_label = gen_label_rtx ();
> +  emit_label (loop_label);
> +
> +  /* Load a word and use orc.b/th.tstnbz to find a zero-byte. */
> +  do_load_from_addr (Xmode, word, addr, src);
> +  do_add3 (addr, addr, GEN_INT (UNITS_PER_WORD));
> +  if (TARGET_ZBB)
> +    do_orcb2 (word, word);
> +  else
> +    do_th_tstnbz2 (word, word);
> +  cond = gen_rtx_EQ (VOIDmode, word, testval);
> +  emit_unlikely_jump_insn (gen_cbranch4 (Xmode, cond, word, testval, loop_label));
> +
> +  /* Calculate the return value by counting zero-bits. */
> +  if (TARGET_ZBB)
> +    do_one_cmpl2 (word, word);
> +  if (TARGET_BIG_ENDIAN)
> +    do_clz2 (zeros, word);
> +  else if (TARGET_ZBB)
> +    do_ctz2 (zeros, word);
> +  else
> +    {
> +      do_th_rev2 (word, word);
> +      do_clz2 (zeros, word);
> +    }
> +
> +  do_lshr3 (zeros, zeros, GEN_INT (exact_log2 (BITS_PER_UNIT)));
> +  do_add3 (addr, addr, zeros);
> +  do_sub3 (result, addr, addr_plus_regsz);
> +
> +  return true;
> +}
> +
> +/* Expand a strlen operation and return true if successful.
> +   Return false if we should let the compiler generate normal
> +   code, probably a strlen call. */
> +
> +bool
> +riscv_expand_strlen (rtx result, rtx src, rtx search_char, rtx align)
> +{
> +  gcc_assert (search_char == const0_rtx);
> +
> +  if (TARGET_ZBB || TARGET_XTHEADBB)
> +    return riscv_expand_strlen_scalar (result, src, align);
> +
> +  return false;
> +}
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 9da2a9f1c42..e078ebc43cb 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -82,6 +82,9 @@ (define_c_enum "unspec" [
>
>    ;; the calling convention of callee
>    UNSPEC_CALLEE_CC
> +
> +  ;; String unspecs
> +  UNSPEC_STRLEN
>  ])
>
>  (define_c_enum "unspecv" [
> @@ -3500,6 +3503,31 @@ (define_expand "msubhisi4"
>    "TARGET_XTHEADMAC"
>  )
>
> +;; Search character in string (generalization of strlen).
> +;; Argument 0 is the resulting offset
> +;; Argument 1 is the string
> +;; Argument 2 is the search character
> +;; Argument 3 is the alignment
> +
> +(define_expand "strlen"
> +  [(set (match_operand:X 0 "register_operand")
> +        (unspec:X [(match_operand:BLK 1 "general_operand")
> +                   (match_operand:SI 2 "const_int_operand")
> +                   (match_operand:SI 3 "const_int_operand")]
> +                  UNSPEC_STRLEN))]
> +  "riscv_inline_strlen && !optimize_size && (TARGET_ZBB || TARGET_XTHEADBB)"
> +{
> +  rtx search_char = operands[2];
> +
> +  if (search_char != const0_rtx)
> +    FAIL;
> +
> +  if (riscv_expand_strlen (operands[0], operands[1], operands[2], operands[3]))
> +    DONE;
> +  else
> +    FAIL;
> +})
> +
>  (include "bitmanip.md")
>  (include "crypto.md")
>  (include "sync.md")
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index 98f342348b7..2491b335aef 100644
> --- a/gcc/config/riscv/riscv.opt
> +++ b/gcc/config/riscv/riscv.opt
> @@ -278,6 +278,10 @@ minline-atomics
>  Target Var(TARGET_INLINE_SUBWORD_ATOMIC) Init(1)
>  Always inline subword atomic operations.
>
> +minline-strlen
> +Target Bool Var(riscv_inline_strlen) Init(0)
> +Inline strlen calls if possible.
> +
>  Enum
>  Name(riscv_autovec_preference) Type(enum riscv_autovec_preference_enum)
>  Valid arguments to -param=riscv-autovec-preference=:
> diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
> index b1f80d1d87c..c012ac0cf33 100644
> --- a/gcc/config/riscv/t-riscv
> +++ b/gcc/config/riscv/t-riscv
> @@ -91,6 +91,12 @@ riscv-selftests.o: $(srcdir)/config/riscv/riscv-selftests.cc \
>  	$(COMPILE) $<
>  	$(POSTCOMPILE)
>
> +riscv-string.o: $(srcdir)/config/riscv/riscv-string.cc \
> +  $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TARGET_H) backend.h $(RTL_H) \
> +  memmodel.h $(EMIT_RTL_H) poly-int.h output.h
> +	$(COMPILE) $<
> +	$(POSTCOMPILE)
> +
>  riscv-v.o: $(srcdir)/config/riscv/riscv-v.cc \
>    $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \
>    $(TM_P_H) $(TARGET_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) \
> diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> index 29f98dec3a8..982b048cb65 100644
> --- a/gcc/config/riscv/thead.md
> +++ b/gcc/config/riscv/thead.md
> @@ -110,7 +110,7 @@ (define_insn "*th_clz2"
>    [(set_attr "type" "bitmanip")
>     (set_attr "mode" "")])
>
> -(define_insn "*th_rev2"
> +(define_insn "th_rev2"
>    [(set (match_operand:GPR 0 "register_operand" "=r")
>          (bswap:GPR (match_operand:GPR 1 "register_operand" "r")))]
>    "TARGET_XTHEADBB && (TARGET_64BIT || mode == SImode)"
> @@ -121,6 +121,13 @@ (define_insn "*th_rev2"
>    [(set_attr "type" "bitmanip")
>     (set_attr "mode" "")])
>
> +(define_insn "th_tstnbz2"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +        (unspec:X [(match_operand:X 1 "register_operand" "r")] UNSPEC_ORC_B))]
> +  "TARGET_XTHEADBB"
> +  "th.tstnbz\t%0,%1"
> +  [(set_attr "type" "bitmanip")])
> +
>  ;; XTheadBs
>
>  (define_insn "*th_tst3"
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 33befee7d6b..4a9e385d009 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1236,7 +1236,8 @@ See RS/6000 and PowerPC Options.
>  -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{reg}
>  -mstack-protector-guard-offset=@var{offset}
>  -mcsr-check -mno-csr-check
> --minline-atomics -mno-inline-atomics}
> +-minline-atomics -mno-inline-atomics
> +-minline-strlen -mno-inline-strlen}
>
>  @emph{RL78 Options}
>  @gccoptlist{-msim -mmul=none -mmul=g13 -mmul=g14 -mallregs
> @@ -29359,6 +29360,14 @@ Do or don't use smaller but slower subword atomic emulation code that uses
>  libatomic function calls. The default is to use fast inline subword atomics
>  that do not require libatomic.
>
> +@opindex minline-strlen
> +@item -minline-strlen
> +@itemx -mno-inline-strlen
> +Do or do not attempt to inline strlen calls if possible.
> +Inlining will only be done if the string is properly aligned
> +and instructions for accelerated processing are available.
> +The default is to not inline strlen calls.
> +
>  @opindex mshorten-memrefs
>  @item -mshorten-memrefs
>  @itemx -mno-shorten-memrefs
> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> index f6276a2d0b6..8bd623dcd0e 100644
> --- a/gcc/emit-rtl.cc
> +++ b/gcc/emit-rtl.cc
> @@ -5168,6 +5168,30 @@ emit_jump_insn (rtx x)
>    return last;
>  }
>
> +/* Make an insn of code JUMP_INSN with pattern X,
> +   add a REG_BR_PROB note that indicates very likely probability,
> +   and add it to the end of the doubly-linked list. */
> +
> +rtx_insn *
> +emit_likely_jump_insn (rtx x)
> +{
> +  rtx_insn *jump = emit_jump_insn (x);
> +  add_reg_br_prob_note (jump, profile_probability::very_likely ());
> +  return jump;
> +}
> +
> +/* Make an insn of code JUMP_INSN with pattern X,
> +   add a REG_BR_PROB note that indicates very unlikely probability,
> +   and add it to the end of the doubly-linked list. */
> +
> +rtx_insn *
> +emit_unlikely_jump_insn (rtx x)
> +{
> +  rtx_insn *jump = emit_jump_insn (x);
> +  add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
> +  return jump;
> +}
> +
>  /* Make an insn of code CALL_INSN with pattern X
>     and add it to the end of the doubly-linked list. */
>
> diff --git a/gcc/rtl.h b/gcc/rtl.h
> index 0e9491b89b4..102ad9b57a6 100644
> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -3347,6 +3347,8 @@ extern rtx_note *emit_note_after (enum insn_note, rtx_insn *);
>  extern rtx_insn *emit_insn (rtx);
>  extern rtx_insn *emit_debug_insn (rtx);
>  extern rtx_insn *emit_jump_insn (rtx);
> +extern rtx_insn *emit_likely_jump_insn (rtx);
> +extern rtx_insn *emit_unlikely_jump_insn (rtx);
>  extern rtx_insn *emit_call_insn (rtx);
>  extern rtx_code_label *emit_label (rtx);
>  extern rtx_jump_table_data *emit_jump_table_data (rtx);
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadbb-strlen-unaligned.c b/gcc/testsuite/gcc.target/riscv/xtheadbb-strlen-unaligned.c
> new file mode 100644
> index 00000000000..57a6b5ea66a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadbb-strlen-unaligned.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-minline-strlen -march=rv32gc_xtheadbb" { target { rv32 } } } */
> +/* { dg-options "-minline-strlen -march=rv64gc_xtheadbb" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
> +
> +typedef long unsigned int size_t;
> +
> +size_t
> +my_str_len (const char *s)
> +{
> +  return __builtin_strlen (s);
> +}
> +
> +/* { dg-final { scan-assembler-not "th.tstnbz\t" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadbb-strlen.c b/gcc/testsuite/gcc.target/riscv/xtheadbb-strlen.c
> new file mode 100644
> index 00000000000..dbc8d1e7da7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadbb-strlen.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-minline-strlen -march=rv32gc_xtheadbb" { target { rv32 } } } */
> +/* { dg-options "-minline-strlen -march=rv64gc_xtheadbb" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
> +
> +typedef long unsigned int size_t;
> +
> +size_t
> +my_str_len (const char *s)
> +{
> +  s = __builtin_assume_aligned (s, 4096);
> +  return __builtin_strlen (s);
> +}
> +
> +/* { dg-final { scan-assembler "th.tstnbz\t" } } */
> +/* { dg-final { scan-assembler-not "jalr" } } */
> +/* { dg-final { scan-assembler-not "call" } } */
> +/* { dg-final { scan-assembler-not "jr" } } */
> +/* { dg-final { scan-assembler-not "tail" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled-2.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled-2.c
> new file mode 100644
> index 00000000000..a481068aa0c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_zbb" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_zbb" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
> +
> +typedef long unsigned int size_t;
> +
> +size_t
> +my_str_len (const char *s)
> +{
> +  s = __builtin_assume_aligned (s, 4096);
> +  return __builtin_strlen (s);
> +}
> +
> +/* { dg-final { scan-assembler-not "orc.b\t" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled.c
> new file mode 100644
> index 00000000000..1295aeb0086
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mno-inline-strlen -march=rv32gc_zbb" { target { rv32 } } } */
> +/* { dg-options "-mno-inline-strlen -march=rv64gc_zbb" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
> +
> +typedef long unsigned int size_t;
> +
> +size_t
> +my_str_len (const char *s)
> +{
> +  s = __builtin_assume_aligned (s, 4096);
> +  return __builtin_strlen (s);
> +}
> +
> +/* { dg-final { scan-assembler-not "orc.b\t" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
> new file mode 100644
> index 00000000000..326fef885d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-minline-strlen -march=rv32gc_zbb" { target { rv32 } } } */
> +/* { dg-options "-minline-strlen -march=rv64gc_zbb" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
> +
> +typedef long unsigned int size_t;
> +
> +size_t
> +my_str_len (const char *s)
> +{
> +  return __builtin_strlen (s);
> +}
> +
> +/* { dg-final { scan-assembler-not "orc.b\t" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c
> new file mode 100644
> index 00000000000..19ebfaef16f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-minline-strlen -march=rv32gc_zbb" { target { rv32 } } } */
> +/* { dg-options "-minline-strlen -march=rv64gc_zbb" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Og" "-Oz" } } */
> +
> +typedef long unsigned int size_t;
> +
> +size_t
> +my_str_len (const char *s)
> +{
> +  s = __builtin_assume_aligned (s, 4096);
> +  return __builtin_strlen (s);
> +}
> +
> +/* { dg-final { scan-assembler "orc.b\t" } } */
> +/* { dg-final { scan-assembler-not "jalr" } } */
> +/* { dg-final { scan-assembler-not "call" } } */
> +/* { dg-final { scan-assembler-not "jr" } } */
> +/* { dg-final { scan-assembler-not "tail" } } */
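
As a reading aid for the rv32gc_zbb sequence quoted in the commit message, here is a minimal portable C sketch of what the inlined code computes on a little-endian target. It is not code from the patch: the names model_orc_b, model_ctz and model_strlen_zbb are hypothetical, orc.b and ctz are emulated in plain C, and the string is assumed to be 4-byte aligned with the whole final word readable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of Zbb orc.b on a 32-bit value: every byte that
   is nonzero becomes 0xff, every zero byte stays 0x00.  */
static uint32_t
model_orc_b (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 4; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint32_t) 0xff << (8 * i);
  return r;
}

/* Hypothetical model of ctz; the argument must be nonzero.  */
static unsigned
model_ctz (uint32_t x)
{
  unsigned n = 0;
  while (!(x & 1))
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Model of the quoted rv32 Zbb sequence.  S must be 4-byte aligned and
   the word containing the NUL byte must be fully readable.  */
static size_t
model_strlen_zbb (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  const unsigned char *start_plus_4 = p + 4;    /* add a3,a0,4 */
  uint32_t word;

  assert ((uintptr_t) s % 4 == 0);

  do
    {
      /* lw a5,0(a0): assemble the word in little-endian byte order so
         the model does not depend on the host's endianness.  */
      word = (uint32_t) p[0] | (uint32_t) p[1] << 8
             | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24;
      p += 4;                                   /* add a0,a0,4 */
      word = model_orc_b (word);                /* orc.b a5,a5 */
    }
  while (word == UINT32_MAX);                   /* beq a5,a4,.L1 */

  word = ~word;                                 /* not a5,a5 */
  p += model_ctz (word) >> 3;                   /* ctz, srl, add */
  return (size_t) (p - start_plus_4);           /* sub a0,a0,a3 */
}
```

The loop leaves the pointer one word past the word holding the NUL byte, which is why the sequence subtracts the start address plus the word size rather than the start address itself.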