From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
    Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: Christoph Müllner
Subject: [PATCH v2 0/2] riscv: Introduce strlen/strcmp/strncmp inline expansion
Date: Wed, 6 Sep 2023 18:07:32 +0200
Message-ID: <20230906160734.2422522-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

This series introduces strlen/strcmp/strncmp inline expansion for
Zbb/XTheadBb.

In recent months, glibc and the Linux kernel have both merged optimized
string routines for RISC-V. The instruction that enables these routines
is Zbb's orc.b (or T-Head's th.tstnbz).

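To illustrate why orc.b helps with string processing, here is a minimal
sketch in portable C. It is not the code GCC emits and not taken from the
patches: orc_b() is a software stand-in for the instruction, and
strlen_aligned() is a hypothetical helper that assumes a little-endian
host, an 8-byte-aligned string, and a buffer padded to a multiple of
8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software stand-in for Zbb's orc.b (assumption: 64-bit word).
   Each non-zero byte of x becomes 0xff, each zero byte stays 0x00. */
static uint64_t orc_b(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        if ((x >> (8 * i)) & 0xff)
            r |= (uint64_t)0xff << (8 * i);
    }
    return r;
}

/* Hypothetical word-at-a-time strlen.
   Assumptions: s is 8-byte aligned, the buffer is padded to a multiple
   of 8 bytes, and the host is little-endian (first byte in the low bits). */
static size_t strlen_aligned(const char *s)
{
    const char *p = s;
    for (;;) {
        uint64_t w;
        memcpy(&w, p, sizeof w);        /* load one aligned word */
        uint64_t m = orc_b(w);
        if (m != UINT64_MAX) {          /* at least one NUL byte in w */
            uint64_t zeros = ~m;        /* 0xff marks each NUL byte */
            size_t idx = 0;
            while ((zeros & 0xff) == 0) {   /* find the first NUL byte */
                zeros >>= 8;
                idx++;
            }
            return (size_t)(p - s) + idx;
        }
        p += sizeof w;                  /* no NUL: advance by one word */
    }
}
```

On real hardware the byte search in the tail would typically be a single
count-trailing-zeros on the inverted mask. For an 8-byte-aligned,
zero-padded buffer, strlen_aligned() should agree with the library strlen().
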
This patch attempts to add optimized string processing to GCC with the
following properties:
* strlen: inline a loop if the string is xlen-aligned
* strcmp/strncmp: inline a peeled comparison loop sequence if both
  strings are xlen-aligned

I already posted the idea in a previous series last November (which is
why this series is called 'v2'):
* https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605996.html
* https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605998.html

The comments raised back then have been addressed. In addition, the
str(n)cmp patch has been restructured to make the code easier to digest.
In total, the following changes were made:
* Address Jeff's comments for the strlen patch
* Change str(n)cmp flags according to Kito's comments
* Ensure that all flags are documented
* Break str(n)cmp expansion into several functions
* Add support for XTheadBb's th.tstnbz

I have not introduced "-minline-str[n]cmp=[bitmanip|vector|auto]" or
"-mstringop-strategy=alg" because there is currently only one
bitmanip/scalar expansion. Such an option can be added in the future
(or omitted, with the decision based on mtune instead).

All optimizations are disabled by default, so there should be no risk
of regressions.

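The str(n)cmp property listed above can be pictured with the following
sketch. It is one plausible reading of a "peeled comparison sequence"
with a byte limit, not the expansion from the patches: strcmp_limited()
and its limit parameter are hypothetical names, orc_b() is a software
stand-in for the instruction, and both strings are assumed to be 8-byte
aligned and zero-padded to a multiple of 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software stand-in for orc.b: non-zero bytes become 0xff. */
static uint64_t orc_b(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        if ((x >> (8 * i)) & 0xff)
            r |= (uint64_t)0xff << (8 * i);
    }
    return r;
}

/* Hypothetical sketch: compare word by word for up to `limit` bytes,
   then fall back to the library strcmp() for whatever remains.
   Assumption: a and b are 8-byte aligned and zero-padded. */
static int strcmp_limited(const char *a, const char *b, size_t limit)
{
    size_t off = 0;
    while (off + 8 <= limit) {
        uint64_t wa, wb;
        memcpy(&wa, a + off, 8);
        memcpy(&wb, b + off, 8);
        /* Leave the fast path if the words differ or wa contains a NUL. */
        if (wa != wb || orc_b(wa) != UINT64_MAX) {
            /* The result is decided somewhere inside this word. */
            for (size_t i = off; i < off + 8; i++) {
                unsigned char ca = (unsigned char)a[i];
                unsigned char cb = (unsigned char)b[i];
                if (ca != cb)
                    return ca < cb ? -1 : 1;
                if (ca == 0)
                    return 0;
            }
        }
        off += 8;   /* words equal and NUL-free: keep going */
    }
    /* Limit reached without a decision: defer to the library call. */
    return strcmp(a + off, b + off);
}
```

A smaller limit keeps the inline sequence short at the cost of reaching
the library call sooner. With a limit of 0 the function reduces to a
plain strcmp() call.
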
Testing was done using the following strategy:
* Enablement/flag tests are part of the patches
* Correctness was tested using qemu-user with glibc's string tests
  compiled for:
** rv64gc (baseline) QEMU_CPU=rv64
** rv64gc_zbb (limit=64) QEMU_CPU=rv64,zbb=false (must fail)
** rv64gc_zbb (limit=64) QEMU_CPU=rv64,zbb=true
** rv64gc_zbb (limit=32) QEMU_CPU=rv64,zbb=true
** rv64gc_xtheadbb (limit=64) QEMU_CPU=rv64 (must fail)
** rv64gc_xtheadbb (limit=64) QEMU_CPU=thead-c906
** rv64gc_xtheadbb (limit=8) QEMU_CPU=thead-c906
** rv32gc_zbb (limit=64) QEMU_CPU=rv32,zbb=true
* SPEC CPU 2017 intrate base/peak with LTO

Christoph Müllner (2):
  riscv: Add support for strlen inline expansion
  riscv: Add support for str(n)cmp inline expansion

 gcc/config.gcc                               |   3 +-
 gcc/config/riscv/bitmanip.md                 |   2 +-
 gcc/config/riscv/riscv-protos.h              |   4 +
 gcc/config/riscv/riscv-string.cc             | 594 ++++++++++++++++++
 gcc/config/riscv/riscv.md                    |  72 ++-
 gcc/config/riscv/riscv.opt                   |  16 +
 gcc/config/riscv/t-riscv                     |   6 +
 gcc/config/riscv/thead.md                    |   9 +-
 gcc/doc/invoke.texi                          |  29 +-
 gcc/emit-rtl.cc                              |  24 +
 gcc/rtl.h                                    |   2 +
 .../gcc.target/riscv/xtheadbb-strcmp.c       |  57 ++
 .../riscv/xtheadbb-strlen-unaligned.c        |  14 +
 .../gcc.target/riscv/xtheadbb-strlen.c       |  19 +
 .../gcc.target/riscv/zbb-strcmp-disabled-2.c |  38 ++
 .../gcc.target/riscv/zbb-strcmp-disabled.c   |  38 ++
 .../gcc.target/riscv/zbb-strcmp-limit.c      |  57 ++
 .../gcc.target/riscv/zbb-strcmp-unaligned.c  |  38 ++
 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c  |  57 ++
 .../gcc.target/riscv/zbb-strlen-disabled-2.c |  15 +
 .../gcc.target/riscv/zbb-strlen-disabled.c   |  15 +
 .../gcc.target/riscv/zbb-strlen-unaligned.c  |  14 +
 gcc/testsuite/gcc.target/riscv/zbb-strlen.c  |  19 +
 23 files changed, 1137 insertions(+), 5 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-string.cc
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strlen-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strlen.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-disabled-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-disabled.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-limit.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen.c

-- 
2.41.0